1. Introduction



The Erdos-Renyi ensemble [201 HI] is a law of a random graph on N vertices, in which each edge is chosen 
independently with probability $p \equiv p(N)$. The corresponding adjacency matrix is called the Erdős-Rényi
matrix. Since each row and column typically has $pN$ nonzero entries, the matrix is sparse as long as
$p \ll 1$. We shall refer to $pN$ as the sparseness parameter of the matrix. In the companion paper [11], we
established the local semicircle law for the Erdős-Rényi matrix for $pN \geq (\log N)^C$, i.e. we showed that,
assuming $pN \geq (\log N)^C$, the eigenvalue density is given by the Wigner semicircle law in any spectral
window containing on average at least $(\log N)^{C'}$ eigenvalues. In this paper, we use this result to prove
both the bulk and edge universalities for the Erdős-Rényi matrix under the restriction that the sparseness
parameter satisfies

$$pN \gg N^{2/3}. \tag{1.1}$$

More precisely, assuming that $p$ satisfies (1.1), we prove that the eigenvalue spacing of the Erdős-Rényi
graph in the bulk of the spectrum has the same distribution as that of the Gaussian orthogonal ensemble
(GOE). In order to outline the statement of the edge universality for the Erdős-Rényi graph, we observe
that, since the matrix elements of the Erdős-Rényi ensemble are either 0 or 1, they do not satisfy the mean
zero condition which typically appears in the random matrix literature. In particular, the largest eigenvalue
of the Erdős-Rényi matrix is very large and lies far away from the rest of the spectrum. We normalize the
Erdős-Rényi matrix so that the bulk of its spectrum lies in the interval $[-2, 2]$. By the edge universality of
the Erdős-Rényi ensemble, we therefore mean that its second largest eigenvalue has the same distribution as
the largest eigenvalue of the GOE, which is the well-known Tracy-Widom distribution. We prove the edge
universality under the assumption (1.1).

Neglecting the mean zero condition, the Erdős-Rényi matrix becomes a Wigner random matrix with a
Bernoulli distribution when $0 < p < 1$ is a constant independent of $N$. Thus for $p \ll 1$ we can view the
Erdős-Rényi matrix, up to a shift in the expectation of the matrix entries, as a singular Wigner matrix
for which the probability distributions of the matrix elements are highly concentrated at zero. Indeed, the
probability for a single entry to be zero is $1 - p$. Alternatively, we can express the singular nature of the
Erdős-Rényi ensemble by the fact that the $k$-th moment of a matrix entry is bounded by

$$N^{-1} (pN)^{-(k-2)/2}. \tag{1.2}$$

For $p \ll 1$ this decay in $k$ is much slower than in the case of Wigner matrices.

There has been spectacular progress in the understanding of the universality of eigenvalue distributions 
for invariant random matrix ensembles [SJ [71 |51 [571 [2H]- The Wigner and Erdos-Renyi matrices are not 
invariant ensembles, however. The moment method [31] [33] [32] is a powerful means for establishing edge
universality. In the context of sparse matrices, it was applied in [32] to prove edge universality for the zero
mean version of the d-regular graph, where the matrix entries take on the values $-1$ and $1$ instead of $0$ and $1$.
The need for this restriction can be ascribed to the following two facts. First, the moment method is suitable
for treating the largest and smallest eigenvalues. But in the case of the Erdős-Rényi matrix, it is the second
largest eigenvalue, not the largest one, which behaves like the largest eigenvalue of the GOE. Second, the
modification of the moment method to matrices with non-symmetric distributions poses a serious technical
challenge.

A general approach to proving the universality of Wigner matrices was recently developed in the series 
of papers [HI [T31 HH HH [HI [TTl [TS] [TO]. In this paper, we further extend this method to cover sparse 
matrices such as the Erdős-Rényi matrix in the range (1.1). Our approach is based on the following three
ingredients. (1) A local semicircle law: a precise estimate of the local eigenvalue density down to energy
scales containing around $(\log N)^C$ eigenvalues. (2) Establishing universality of the eigenvalue distribution of
Gaussian divisible ensembles, via an estimate on the rate of decay to local equilibrium of the Dyson Brownian
motion [^. (3) A density argument which shows that for any probability distribution of the matrix entries 
there exists a Gaussian divisible distribution such that the two associated Wigner ensembles have identical 
local eigenvalue statistics down to the scale $1/N$. In the case of Wigner matrices, the edge universality can
also be obtained by a modification of (1) and (3) [19]. The class of ensembles to which this method applies 
is extremely general. So far it includes all (generalized) Wigner matrices under the sole assumption that 
the distributions of the matrix elements have a uniform subexponential decay. In this paper we extend this 
method to the Erdos-Renyi matrix, which in fact represents a generalization in two unrelated directions: (a) 
the law of the matrix entries is much more singular, and (b) the matrix elements have nonzero mean. 

As an application of the local semicircle law for sparse matrices proved in [11], we also prove the bulk
universality for generalized Wigner matrices under the sole assumption that the matrix entries have $4 + \varepsilon$
moments. This relaxes the subexponential decay condition on the tail of the distributions assumed in
[17] [18] [19]. Moreover, we prove the edge universality of Wigner matrices under the assumption that the
matrix entries have $12 + \varepsilon$ moments. These results on Wigner matrices are stated and proved in Section
7 below. We note that in [3] it was proved that the distributions of the largest eigenvalues are Poisson if
the entries have at most $4 - \varepsilon$ moments. Numerical results [4] predict that the existence of four moments
corresponds to a sharp transition point, where the transition is from the Poisson process to the determinantal
point process with Airy kernel.

We remark that the bulk universality for Hermitian Wigner matrices was also obtained in [34], partly by
using the result of [35] and the local semicircle law from Step (1). For real symmetric Wigner matrices, the
bulk universality in [34] requires that the first four moments of every matrix element coincide with those of
the standard Gaussian random variable. In particular, this restriction rules out the real Bernoulli Wigner
matrices, which may be regarded as the simplest kind of Erdős-Rényi matrix (again neglecting additional
difficulties arising from the nonzero mean of the entries).

As a first step in our general strategy to prove universality, we proved, in the companion paper [11],
a local semicircle law stating that the eigenvalue distribution of the Erdős-Rényi ensemble in any spectral
window which on average contains at least $(\log N)^C$ eigenvalues is given by the Wigner semicircle law. As
a corollary, we proved that the eigenvalue locations are equal to those predicted by the semicircle law, up
to an error of order $(pN)^{-1}$. The second step of the strategy outlined above for Wigner matrices is to
estimate the local relaxation time of the Dyson Brownian motion [15] [16]. This is achieved by constructing
a pseudo-equilibrium measure and estimating the global relaxation time to this measure. For models with
nonzero mean, such as the Erdős-Rényi matrix, the largest eigenvalue is located very far from its equilibrium
position, and moves rapidly under the Dyson Brownian motion. Hence a uniform approach to equilibrium is
impossible. We overcome this problem by integrating out the largest eigenvalue from the joint probability
distribution of the eigenvalues, and considering the flow of the marginal distribution of the remaining $N - 1$
eigenvalues. This enables us to establish bulk universality for sparse matrices with nonzero mean under the
restriction (1.1). This approach trivially also applies to Wigner matrices whose entries have nonzero mean.

Since the eigenvalue locations are only established with accuracy $(pN)^{-1}$, the local relaxation time for
the Dyson Brownian motion with the initial data given by the Erdős-Rényi ensemble is only shown to be
less than $1/(p^2 N) \gg 1/N$. For Wigner ensembles, it was proved in [T^] that the local relaxation time is
of order $1/N$. Moreover, the slow decay of the third moment of the Erdős-Rényi matrix entries, as given
in (1.2), makes the approximation in Step (3) above less effective. These two effects impose the restriction
(1.1) in our proof of bulk universality. At the end of Section 2 we give a more detailed account of how this
restriction arises. The reason why the same restriction is needed for the edge universality is different;
see Section 6.3. We note, however, that both the bulk and edge universalities are expected to hold without
this restriction, as long as the graphs are not too sparse in the sense that $pN \gg \log N$; for d-regular graphs
this condition is conjectured to be the weaker $pN \gg 1$ [30]. A discussion of related problems on d-regular
graphs can be found in [55].

Acknowledgement. We thank P. Sarnak for bringing the problem of universality of sparse matrices to our 
attention. 



2. Definitions and results 



We begin this section by introducing a class of $N \times N$ sparse random matrices $A \equiv A_N$. Here $N$ is a large
parameter. (Throughout the following we shall often refrain from explicitly indicating $N$-dependence.)

The motivating example is the Erdős-Rényi matrix, or the adjacency matrix of the Erdős-Rényi random
graph. Its entries are independent (up to the constraint that the matrix be symmetric), and equal to 1 with
probability $p$ and 0 with probability $1 - p$. For our purposes it is convenient to replace $p$ with the new
parameter $q \equiv q(N)$, defined through $p = q^2/N$. Moreover, we rescale the matrix in such a way that its bulk
eigenvalues typically lie in an interval of size of order one.

Thus we are led to the following definition. Let $A = (a_{ij})$ be the symmetric $N \times N$ matrix whose entries
$a_{ij}$ are independent (up to the symmetry constraint $a_{ij} = a_{ji}$) and each element is distributed according to

$$a_{ij} = \frac{\gamma}{q} \begin{cases} 1 & \text{with probability } \frac{q^2}{N} \\ 0 & \text{with probability } 1 - \frac{q^2}{N}. \end{cases} \tag{2.1}$$

Here $\gamma := (1 - q^2/N)^{-1/2}$ is a scaling introduced for convenience. The parameter $q \leq N^{1/2}$ expresses the
sparseness of the matrix; it may depend on $N$. Since $A$ typically has $q^2 N$ nonvanishing entries, we find that
if $q \ll N^{1/2}$ then the matrix is sparse.

We extract the mean of each matrix entry and write

$$A = H + \gamma q \, |e\rangle\langle e|,$$

where the entries of $H$ (given by $h_{ij} = a_{ij} - \gamma q/N$) have mean zero, and we defined the vector

$$e \equiv e_N := \frac{1}{\sqrt{N}} (1, \dots, 1)^T. \tag{2.2}$$

Here we use the notation $|e\rangle\langle e|$ to denote the orthogonal projection onto $e$, i.e. $(|e\rangle\langle e|)_{ij} := N^{-1}$.
One readily finds that the matrix elements of $H$ satisfy the moment bounds

$$\mathbb{E} h_{ij}^2 = \frac{1}{N}, \qquad \mathbb{E} |h_{ij}|^p \leq \frac{1}{N q^{p-2}}, \tag{2.3}$$

where $p \geq 2$.
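The moment bounds (2.3) can be checked numerically for concrete parameters. The following is a minimal sketch, not part of the original argument: the function name `centered_entry_moments` and the values of `N` and `q` are illustrative assumptions, chosen in the sparse regime $q \ll N^{1/2}$. It computes the exact moments of $h_{ij} = a_{ij} - \gamma q/N$ from the two-point law (2.1).

```python
def centered_entry_moments(N, q, p_max=8):
    """Exact moments of h = a - E[a] for the two-point law (2.1).

    Hypothetical helper for illustration; N and q are free parameters.
    """
    prob = q * q / N                 # probability that an entry is nonzero
    gamma = (1.0 - prob) ** -0.5     # the scaling gamma from (2.1)
    mean = gamma * q / N             # E[a_ij], the quantity subtracted to form h_ij
    # h takes two values: gamma/q - mean (entry present) and -mean (entry absent)
    values = [(gamma / q - mean, prob), (-mean, 1.0 - prob)]
    signed_mean = sum(w * v for v, w in values)
    moments = {p: sum(w * abs(v) ** p for v, w in values)
               for p in range(1, p_max + 1)}
    return signed_mean, moments


N, q = 10_000, 10.0                  # illustrative values with q << N**0.5
signed_mean, moments = centered_entry_moments(N, q)
# Right-hand side of (2.3) for each p >= 2
bounds = {p: 1.0 / (N * q ** (p - 2)) for p in moments if p >= 2}
```

For these values the second moment equals $1/N$ up to rounding and the higher moments stay below $1/(N q^{p-2})$. When $q^2/N$ approaches 1 the factor $\gamma^p$ grows, which is one reason the general Definition 2.1 allows a constant $C^p$ in the bound.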

More generally, we consider the following class of random matrices with non-centred entries characterized
by two parameters $q$ and $f$, which may be $N$-dependent. The parameter $q$ expresses how singular the
distribution of $h_{ij}$ is; in particular, it expresses the sparseness of $A$ for the special case (2.1). The parameter
$f$ determines the nonzero expectation value of the matrix elements.
Definition 2.1 (H). We consider $N \times N$ random matrices $H = (h_{ij})$ whose entries are real and independent
up to the symmetry constraint $h_{ij} = h_{ji}$. We assume that the elements of $H$ satisfy the moment conditions

$$\mathbb{E} h_{ij} = 0, \qquad \mathbb{E} |h_{ij}|^2 = \frac{1}{N}, \qquad \mathbb{E} |h_{ij}|^p \leq \frac{C^p}{N q^{p-2}} \tag{2.4}$$

for $1 \leq i,j \leq N$ and $2 \leq p \leq (\log N)^{10 \log\log N}$, where $C$ is a positive constant. Here $q \equiv q(N)$ satisfies

$$(\log N)^{15 \log\log N} \leq q \leq C N^{1/2} \tag{2.5}$$

for some positive constant $C$.

Definition 2.2 (A). Let $H$ satisfy Definition 2.1. Define the matrix $A = (a_{ij})$ through

$$A := H + f \, |e\rangle\langle e|, \tag{2.6}$$

where $f \equiv f(N)$ is a deterministic number that satisfies

$$1 + \varepsilon_0 \leq f \leq N^C, \tag{2.7}$$

for some constants $\varepsilon_0 > 0$ and $C$.

Remark 2.3. For definiteness, and bearing the Erdős-Rényi matrix in mind, we restrict ourselves to real
symmetric matrices satisfying Definition 2.2. However, our proof applies equally to complex Hermitian sparse
matrices.

Remark 2.4. As observed in [11], Remark 2.5, we may take $H$ to be a Wigner matrix whose entries have
subexponential decay $\mathbb{E} |h_{ij}|^p \leq (Cp)^{\theta p} N^{-p/2}$ by choosing $q = N^{1/2} (\log N)^{-5\theta \log\log N}$.

We shall use $C$ and $c$ to denote generic positive constants which may only depend on the constants in
assumptions such as (2.4). Typically, $C$ denotes a large constant and $c$ a small constant. Note that the
fundamental large parameter of our model is $N$, and the notations $\gg$, $\ll$, $O(\cdot)$, $o(\cdot)$ always refer to the limit
$N \to \infty$. Here $a \ll b$ means $a = o(b)$. We write $a \sim b$ for $C^{-1} a \leq b \leq C a$.

After these preparations, we may now state our results. They concern the distribution of the eigenvalues
of $A$, which we order in a nondecreasing fashion and denote by $\mu_1 \leq \cdots \leq \mu_N$. We shall only consider the
distribution of the $N - 1$ first eigenvalues $\mu_1, \dots, \mu_{N-1}$. The largest eigenvalue $\mu_N$ lies far removed from the
others, and its distribution is known to be normal with mean $f + f^{-1}$ and variance $N^{-1/2}$; see [11], Theorem
6.2, for more details.

First, we establish the bulk universality of eigenvalue correlations. Let $p(\mu_1, \dots, \mu_N)$ be the probability
density of the ordered eigenvalues $\mu_1 \leq \cdots \leq \mu_N$ of $A$. (We use the density of the law of the eigenvalues
for simplicity of notation, but our results remain valid when no such density exists.) Introduce the marginal density

$$p_N^{(N-1)}(\mu_1, \dots, \mu_{N-1}) := \frac{1}{(N-1)!} \sum_{\sigma \in S_{N-1}} \int d\mu_N \; p(\mu_{\sigma(1)}, \dots, \mu_{\sigma(N-1)}, \mu_N).$$

In other words, $p_N^{(N-1)}$ is the symmetrized probability density of the first $N - 1$ eigenvalues of $A$. For
$n \leq N - 1$ we define the $n$-point correlation function (marginal) through

$$p_N^{(n)}(\mu_1, \dots, \mu_n) := \int d\mu_{n+1} \cdots d\mu_{N-1} \; p_N^{(N-1)}(\mu_1, \dots, \mu_{N-1}). \tag{2.8}$$

Similarly, we denote by $p_{GOE,N}^{(n)}$ the $n$-point correlation function of the symmetrized eigenvalue density of an
$N \times N$ GOE matrix.






Theorem 2.5 (Bulk universality). Suppose that $A$ satisfies Definition 2.2 with $q \geq N^{\phi}$ for some $\phi$
satisfying $0 < \phi \leq 1/2$, and that $f$ additionally satisfies $f \leq C N^{1/2}$ for some $C > 0$. Let $\beta > 0$ and assume
that

$$\phi > \frac{1}{3} + \frac{\beta}{6}. \tag{2.9}$$

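Let $E \in (-2,2)$ and take a sequence $(b_N)$ satisfying $N^{\varepsilon - \beta} \leq b_N \leq ||E| - 2|/2$ for some $\varepsilon > 0$. Let $n \in \mathbb{N}$
and $O : \mathbb{R}^n \to \mathbb{R}$ be compactly supported and continuous. Then

$$\lim_{N \to \infty} \int_{E - b_N}^{E + b_N} \frac{dE'}{2 b_N} \int d\alpha_1 \cdots d\alpha_n \, O(\alpha_1, \dots, \alpha_n) \, \frac{1}{\varrho_{sc}(E)^n} \Big( p_N^{(n)} - p_{GOE,N}^{(n)} \Big) \Big( E' + \frac{\alpha_1}{N \varrho_{sc}(E)}, \dots, E' + \frac{\alpha_n}{N \varrho_{sc}(E)} \Big) = 0,$$

where we abbreviated

$$\varrho_{sc}(E) := \frac{1}{2\pi} \sqrt{[4 - E^2]_+} \tag{2.10}$$

for the density of the semicircle law.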



Remark 2.6. Theorem 2.5 implies bulk universality for sparse matrices provided that $1/3 < \phi \leq 1/2$. See
the end of this section for an account of the origin of the condition (2.9).

We also prove the universality of the extreme eigenvalues. 

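Theorem 2.7 (Edge universality). Suppose that $A$ satisfies Definition 2.2 with $q \geq N^{\phi}$ for some $\phi$
satisfying $1/3 < \phi \leq 1/2$. Let $V$ be an $N \times N$ GOE matrix whose eigenvalues we denote by $\lambda_1^V \leq \cdots \leq \lambda_N^V$.
Then there is a $\delta > 0$ such that for any $s$ we have

$$\mathbb{P}^V \big( N^{2/3} (\lambda_N^V - 2) \leq s - N^{-\delta} \big) - N^{-\delta} \leq \mathbb{P}^A \big( N^{2/3} (\mu_{N-1} - 2) \leq s \big) \leq \mathbb{P}^V \big( N^{2/3} (\lambda_N^V - 2) \leq s + N^{-\delta} \big) + N^{-\delta} \tag{2.11}$$

as well as

$$\mathbb{P}^V \big( N^{2/3} (\lambda_1^V + 2) \leq s - N^{-\delta} \big) - N^{-\delta} \leq \mathbb{P}^A \big( N^{2/3} (\mu_1 + 2) \leq s \big) \leq \mathbb{P}^V \big( N^{2/3} (\lambda_1^V + 2) \leq s + N^{-\delta} \big) + N^{-\delta}, \tag{2.12}$$

for $N \geq N_0$, where $N_0$ is independent of $s$. Here $\mathbb{P}^V$ denotes the law of the GOE matrix $V$, and $\mathbb{P}^A$ the law
of the sparse matrix $A$.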



Remark 2.8. Theorem 6.4 can be easily extended to correlation functions of a finite collection of extreme
eigenvalues.

Remark 2.9. The GOE distribution function $F_1(s) := \lim_N \mathbb{P}^V (N^{2/3} (\lambda_N^V - 2) \leq s)$ of the largest eigenvalue
of $V$ has been identified by Tracy and Widom [36] [37], and can be computed in terms of Painlevé equations.
A similar result holds for the smallest eigenvalue $\lambda_1^V$ of $V$.

Remark 2.10. A result analogous to Theorem 2.7 holds for the extreme eigenvalues of the centred sparse
matrix $H$; see (6.15) below.

We conclude this section by giving a sketch of the origin of the restriction $\phi > 1/3$ in Theorem 2.5. To
simplify the outline of the argument, we set $\beta = 0$ in Theorem 2.5 and ignore any powers of $N^{\varepsilon}$. The proof
of Theorem 2.5 is based on an analysis of the local relaxation properties of the marginal Dyson Brownian
motion, obtained from the usual Dyson Brownian motion by integrating out the largest eigenvalue $\mu_N$. As
an input, we need the bound

$$Q := \mathbb{E} \sum_{\alpha = 1}^{N-1} |\mu_\alpha - \gamma_\alpha|^2 \leq N^{1 - 4\phi}, \tag{2.13}$$

where $\gamma_\alpha$ denotes the classical location of the $\alpha$-th eigenvalue (see (3.15) below). The bound (2.13) was
proved in [11]. In that paper we prove, roughly, that $|\mu_\alpha - \gamma_\alpha| \leq q^{-2} \leq N^{-2\phi}$, from which (2.13) follows.
The precise form is given in (3.16). We then take an arbitrary initial sparse matrix ensemble $A_0$ and evolve
it according to the Dyson Brownian motion up to a time $\tau = N^{-\rho}$, for some $\rho > 0$. We prove that the local
spectral statistics, in the first $N - 1$ eigenvalues, of the evolved ensemble $A_\tau$ at time $\tau$ coincide with those
of a GOE matrix $V$, provided that

$$Q \tau^{-1} = Q N^{\rho} \ll 1. \tag{2.14}$$

The precise statement is given in (4.9). This gives us the condition

$$1 - 4\phi + \rho < 0. \tag{2.15}$$

Next, we compare the local spectral statistics of a given Erdős-Rényi matrix $A$ with those of the time-evolved
ensemble $A_\tau$ by constructing an appropriate initial $A_0$, chosen so that the first four moments of $A$ and $A_\tau$
are close. More precisely, by comparing Green functions, we prove that the local spectral statistics of $A$ and
$A_\tau$ coincide if the first three moments of the entries of $A$ and $A_\tau$ coincide and their fourth moments differ
by at most $N^{-2-\delta}$ for some $\delta > 0$. (See Proposition 5.2.) Given $A$ we find, by explicit construction, a sparse
matrix $A_0$ such that the first three moments of the entries of $A_\tau$ are equal to those of $A$, and their fourth
moments differ by at most $N^{-1-2\phi} \tau = N^{-1-2\phi-\rho}$ (see Section 5). Thus the local spectral statistics of $A$ and $A_\tau$
coincide provided that

$$1 - 2\phi - \rho < 0. \tag{2.16}$$

From the two conditions (2.15) and (2.16) we find that the local spectral statistics of $A$ and $V$ coincide
provided that $\phi > 1/3$.
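The way the two conditions combine can be made explicit: (2.16) forces $\rho > 1 - 2\phi$ and (2.15) forces $\rho < 4\phi - 1$, so an admissible $\rho$ exists exactly when $1 - 2\phi < 4\phi - 1$, i.e. $\phi > 1/3$. The sketch below only illustrates this arithmetic with exact rationals; the function name `admissible_rho_window` is a hypothetical helper, and the extra constraint $\rho > 0$ is taken from the statement of the sketch above.

```python
from fractions import Fraction


def admissible_rho_window(phi):
    """Open interval of exponents rho allowed by (2.15) and (2.16), or None.

    Hypothetical helper; phi should be a Fraction with 0 < phi <= 1/2.
    """
    lower = max(Fraction(0), 1 - 2 * phi)   # (2.16): rho > 1 - 2*phi, together with rho > 0
    upper = 4 * phi - 1                     # (2.15): rho < 4*phi - 1
    return (lower, upper) if lower < upper else None


window_dense = admissible_rho_window(Fraction(1, 2))      # Wigner-like case, q ~ N^(1/2)
window_sparse = admissible_rho_window(Fraction(2, 5))
window_critical = admissible_rho_window(Fraction(1, 3))   # boundary case
window_too_sparse = admissible_rho_window(Fraction(3, 10))
```

The window is nonempty for $\phi = 2/5$ and $\phi = 1/2$ and collapses at $\phi = 1/3$, in line with the restriction in Theorem 2.5 for $\beta = 0$.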



3. The strong local semicircle law and eigenvalue locations 



In this preliminary section we collect the main notations and tools from the companion paper [11] that we
shall need for the proofs. Throughout this paper we shall make use of the parameter

$$\xi \equiv \xi_N := 5 \log\log N, \tag{3.1}$$

which will keep track of powers of $\log N$ and probabilities of high-probability events. Note that in [11], $\xi$
was a free parameter. In this paper we choose the special form (3.1) for simplicity.
We introduce the spectral parameter

$$z = E + i\eta$$

where $E \in \mathbb{R}$ and $\eta > 0$. Let $\Sigma \geq 3$ be a fixed but arbitrary constant and define the domain

$$D_L := \big\{ z \in \mathbb{C} : |E| \leq \Sigma, \; (\log N)^L N^{-1} \leq \eta \leq 3 \big\}, \tag{3.2}$$

with a parameter $L \equiv L(N)$ that always satisfies

$$L \geq 8\xi. \tag{3.3}$$

For $\operatorname{Im} z > 0$ we define the Stieltjes transform of the local semicircle law
$$m_{sc}(z) := \int_{\mathbb{R}} \frac{\varrho_{sc}(x)}{x - z} \, dx, \tag{3.4}$$

where the density $\varrho_{sc}$ was defined in (2.10). The Stieltjes transform $m_{sc}(z) \equiv m_{sc}$ may also be characterized
as the unique solution of

$$m_{sc} + \frac{1}{z + m_{sc}} = 0 \tag{3.5}$$

satisfying $\operatorname{Im} m_{sc}(z) > 0$ for $\operatorname{Im} z > 0$. This implies that

$$m_{sc}(z) = \frac{-z + \sqrt{z^2 - 4}}{2}, \tag{3.6}$$

where the square root is chosen so that $m_{sc}(z) \sim -z^{-1}$ as $z \to \infty$. We define the resolvent of $A$ through

$$G(z) := (A - z)^{-1},$$

as well as the Stieltjes transform of the empirical eigenvalue density

$$m(z) := \frac{1}{N} \operatorname{Tr} G(z).$$
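The characterization (3.5) and the branch choice in (3.6) can be checked numerically. The following sketch is an illustration only: the function name `m_sc` and the sample point `z` are assumptions. It selects the branch by the condition $\operatorname{Im} m_{sc}(z) > 0$, which for $\operatorname{Im} z > 0$ singles out exactly one root of the quadratic $m^2 + zm + 1 = 0$ equivalent to (3.5).

```python
import cmath


def m_sc(z):
    """Stieltjes transform of the semicircle law for Im z > 0, via (3.6).

    The two roots of m**2 + z*m + 1 = 0 have product 1 and sum -z, so for
    Im z > 0 exactly one of them has positive imaginary part.
    """
    root = cmath.sqrt(z * z - 4)
    candidates = ((-z + root) / 2, (-z - root) / 2)
    return max(candidates, key=lambda m: m.imag)


z = 0.3 + 0.05j                      # illustrative spectral parameter E + i*eta
m = m_sc(z)
residual = abs(m + 1 / (z + m))      # left-hand side of (3.5), should vanish
```

Selecting the root by its imaginary part avoids depending on the branch cut convention of the library square root.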

For $x \in \mathbb{R}$ we define the distance to the spectral edge through

$$\kappa_x := \big| |x| - 2 \big|. \tag{3.7}$$

At this point we warn the reader that we depart from our conventions in [11]. In that paper, the quantities
$G(z)$ and $m(z)$ defined above in terms of $A$ bore a tilde to distinguish them from the same quantities defined
in terms of $H$. In this paper we drop the tilde, as we shall not need resolvents defined in terms of $H$.

We shall frequently have to deal with events of very high probability, for which the following definition
is useful. It is characterized by two positive parameters, $\xi$ and $\nu$, where $\xi$ is given by (3.1).

Definition 3.1 (High probability events). We say that an $N$-dependent event $\Omega$ holds with $(\xi, \nu)$-high
probability if

$$\mathbb{P}(\Omega^c) \leq e^{-\nu (\log N)^{\xi}} \tag{3.8}$$

for $N \geq N_0(\nu)$.

Similarly, for a given event $\Omega_0$ we say that $\Omega$ holds with $(\xi, \nu)$-high probability on $\Omega_0$ if

$$\mathbb{P}(\Omega_0 \cap \Omega^c) \leq e^{-\nu (\log N)^{\xi}}$$

for $N \geq N_0(\nu)$.
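To get a feeling for the size of the bound in (3.8) with the choice (3.1), one can compare it with an inverse power of $N$ on a logarithmic scale. This is a rough numerical sketch; the function names, the value $\nu = 1$ and the sample values of $N$ and $D$ are illustrative assumptions rather than constants from the text.

```python
import math


def log_failure_bound(N, nu=1.0):
    """Logarithm of exp(-nu * (log N)**xi) with xi = 5 log log N, as in (3.1)."""
    log_n = math.log(N)
    xi = 5.0 * math.log(log_n)
    return -nu * log_n ** xi


def beats_polynomial(N, D, nu=1.0):
    """True if exp(-nu (log N)^xi) <= N^(-D) for this particular N."""
    return log_failure_bound(N, nu) <= -D * math.log(N)


large_n_ok = beats_polynomial(10**6, 100)   # compare with N^(-100) at N = 10^6
small_n_ok = beats_polynomial(10, 100)      # the same comparison at N = 10
```

The comparison holds at $N = 10^6$ but fails at $N = 10$, which reflects the role of the threshold $N_0(\nu)$: the bound is smaller than any fixed inverse power of $N$ only once $N$ is large enough.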

Remark 3.2. In the following we shall not keep track of the explicit value of $\nu$; in fact we allow $\nu$ to decrease
from one line to another without introducing a new notation. All of our results will hold for $\nu \leq \nu_0$, where
$\nu_0$ depends only on the constants $C$ in Definition 2.1 and the parameter $\Sigma$ in (3.2).
Theorem 3.3 (Local semicircle law [11]). Suppose that $A$ satisfies Definition 2.2 with the condition
(2.7) replaced with

!^ f ^ . (3.9) 

Moreover, assume that 

$$q \geq (\log N)^{120\xi}, \tag{3.10}$$
$$L \geq 120\xi. \tag{3.11}$$

Then there is a constant $\nu > 0$, depending on $\Sigma$ and the constants $C$ in (2.4) and (2.5), such that the
following holds.

We have the local semicircle law: the event

nji'"M-".*'N<'°«"''«HW?-^}+^)) p^'^' 

holds with $(\xi, \nu)$-high probability. Moreover, we have the following estimate on the individual matrix elements
of $G$. If instead of (3.9) $f$ satisfies

$$0 \leq f \leq C_0 N^{1/2}, \tag{3.13}$$

for some constant $C_0$, then the event



n,axjG,(z) - J,™..(.)| ^ (logA^)-^ (i + f^^^ + ^ ) } (3.14) 



holds with $(\xi, \nu)$-high probability.

Next, we recall that the $N - 1$ first eigenvalues of $A$ are close to their classical locations predicted by
the semicircle law. Let $n_{sc}(E) := \int_{-\infty}^{E} \varrho_{sc}(x) \, dx$ denote the integrated density of the local semicircle law.
Denote by $\gamma_\alpha$ the classical location of the $\alpha$-th eigenvalue, defined through

$$n_{sc}(\gamma_\alpha) = \frac{\alpha}{N} \qquad \text{for } \alpha = 1, \dots, N. \tag{3.15}$$
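The classical locations in (3.15) are easy to compute numerically. The sketch below is an illustration under stated assumptions: it uses the closed form of the integrated density obtained by integrating (2.10), namely $n_{sc}(E) = \frac{1}{2} + \frac{E \sqrt{4 - E^2}}{4\pi} + \frac{1}{\pi} \arcsin(E/2)$ for $|E| \leq 2$, and solves (3.15) by bisection. The function names and the choice $N = 10$ are hypothetical.

```python
import math


def n_sc(E):
    """Integrated semicircle density, in closed form on [-2, 2]."""
    E = max(-2.0, min(2.0, E))
    return (0.5
            + E * math.sqrt(4.0 - E * E) / (4.0 * math.pi)
            + math.asin(E / 2.0) / math.pi)


def classical_location(alpha, N, tol=1e-13):
    """Solve n_sc(gamma) = alpha / N by bisection, following (3.15)."""
    target = alpha / N
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if n_sc(mid) < target:
            lo = mid      # the root lies to the right of mid
        else:
            hi = mid      # the root lies at or to the left of mid
    return 0.5 * (lo + hi)


gammas = [classical_location(a, 10) for a in range(1, 11)]
```

With this convention $\gamma_N = 2$ sits at the upper spectral edge, and the locations are symmetric in the sense that $\gamma_\alpha = -\gamma_{N - \alpha}$.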

The following theorem compares the locations of the eigenvalues $\mu_1, \dots, \mu_{N-1}$ to their classical locations
$\gamma_1, \dots, \gamma_{N-1}$.

Theorem 3.4 (Eigenvalue locations [11]). Suppose that $A$ satisfies Definition 2.2 and let $\phi$ be an
exponent satisfying $0 < \phi \leq 1/2$, and set $q = N^{\phi}$. Then there is a constant $\nu > 0$, depending on $\Sigma$ and the
constants $C$ in (2.4), (2.5), and (2.7), as well as a constant $C > 0$ such that the following holds.
We have with $(\xi, \nu)$-high probability that

$$\sum_{\alpha = 1}^{N-1} |\mu_\alpha - \gamma_\alpha|^2 \leq (\log N)^{C\xi} \big( N^{1 - 4\phi} + N^{4/3 - 8\phi} \big). \tag{3.16}$$

Moreover, for all $\alpha = 1, \dots, N - 1$ we have with $(\xi, \nu)$-high probability that

$$|\mu_\alpha - \gamma_\alpha| \leq (\log N)^{C\xi} \Big( N^{-2/3} \Big[ \hat\alpha^{-1/3} + \mathbf{1} \big( \hat\alpha \leq (\log N)^{C\xi} (1 + N^{1 - 3\phi}) \big) \Big] + N^{2/3 - 4\phi} \hat\alpha^{-2/3} + N^{-2\phi} \Big), \tag{3.17}$$

where we abbreviated $\hat\alpha := \min\{\alpha, N - \alpha\}$.






Remark 3.5. Under the assumption $\phi \geq 1/3$ the estimate (3.17) simplifies to

$$|\mu_\alpha - \gamma_\alpha| \leq (\log N)^{C\xi} \big( N^{-2/3} \hat\alpha^{-1/3} + N^{-2\phi} \big), \tag{3.18}$$

which holds with $(\xi, \nu)$-high probability.

Finally, we record two basic results from [11] for later reference. From [11], Lemmas 4.4 and 6.1, we get,
with $(\xi, \nu)$-high probability,

max |A„| < 2 + (logA^)'^M<?-'+^~^^^ , max 2 + (logiV)^M + iV-'/^ . (3.19) 

Moreover, from [11], Theorem 6.2, we get, with $(\xi, \nu)$-high probability,

$$\mu_N = f + \frac{1}{f} + o(1). \tag{3.20}$$

In particular, using (2.7) we get, with $(\xi, \nu)$-high probability,

$$2 + \sigma \leq \mu_N \leq N^C, \tag{3.21}$$

where $\sigma > 0$ is a constant spectral gap depending only on the constant $\varepsilon_0$ from (2.7).



4. Local ergodicity of the marginal Dyson Brownian motion 

In Sections 4 and 5 we give the proof of Theorem 2.5. Throughout Sections 4 and 5 it is convenient to
adopt a slightly different notation for the eigenvalues of $A$. In these two sections we shall consistently use
$x_1 \leq \cdots \leq x_N$ to denote the ordered eigenvalues of $A$, instead of $\mu_1 \leq \cdots \leq \mu_N$ used in the rest of this
paper. We abbreviate the collection of eigenvalues by $\mathbf{x} = (x_1, \dots, x_N)$.

The main tool in the proof of Theorem 2.5 is the marginal Dyson Brownian motion, obtained from the
usual Dyson Brownian motion of the eigenvalues $\mathbf{x}$ by integrating out the largest eigenvalue $x_N$. In this
section we establish the local ergodicity of the marginal Dyson Brownian motion and derive an upper bound on its
local relaxation time.

Let $A_0 = (a_{ij,0})_{ij}$ be a matrix satisfying Definition 2.2 with constants $q_0 \geq N^{\phi}$ and $f_0 \geq 1 + \varepsilon_0$. Let
$(B_{ij,t})_{ij}$ be a symmetric matrix of independent Brownian motions, whose off-diagonal entries have variance
$t$ and diagonal entries variance $2t$. Let the matrix $A_t = (a_{ij,t})_{ij}$ satisfy the stochastic differential equation

$$da_{ij} = \frac{dB_{ij}}{\sqrt{N}} - \frac{1}{2} a_{ij} \, dt. \tag{4.1}$$

It is easy to check that the distribution of $A_t$ is equal to the distribution of

$$e^{-t/2} A_0 + (1 - e^{-t})^{1/2} V, \tag{4.2}$$

where $V$ is a GOE matrix independent of $A_0$.
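The representation (4.2) makes it easy to see that the flow preserves the variance normalization of the centred entries while slowly shrinking the mean. The following sketch records this bookkeeping; it assumes, as the normalization in (4.1) suggests, that the off-diagonal entries of $V$ have variance $1/N$, and it writes $A_0 = H_0 + f_0 |e\rangle\langle e|$ as in Definition 2.2. The function name and the sample values are hypothetical.

```python
import math


def evolved_entry_stats(t, N, f0):
    """Mean parameter and centred off-diagonal variance of A_t, read off from (4.2).

    Assumes Var(h_ij) = 1/N at time 0 and GOE off-diagonal variance 1/N.
    """
    decay = math.exp(-t / 2.0)
    f_t = decay * f0                            # coefficient of |e><e| after time t
    variance = (decay ** 2) / N                 # contribution of e^{-t/2} H_0
    variance += (1.0 - math.exp(-t)) / N        # contribution of (1 - e^{-t})^{1/2} V
    return f_t, variance


f_t, variance = evolved_entry_stats(t=0.3, N=500, f0=2.0)
```

Under these assumptions the variance stays equal to $1/N$ for all $t$, and the mean parameter is $e^{-t/2} f_0$, which is close to $f_0$ for the short times considered in this section.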

Let $\rho$ be a constant satisfying $0 < \rho < 1$ to be chosen later. In the following we shall consider times $t$ in
the interval $[t_0, \tau]$, where

$$t_0 := N^{-\rho - 1}, \qquad \tau := N^{-\rho}.$$
One readily checks that, for any fixed $\rho$ as above, the matrix $A_t$ satisfies Definition 2.2 with constants

ft = f{l + 0{N~'")) ^ l + y, % ~ go ^ 

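where all estimates are uniform for $t \in [t_0, \tau]$. Denoting by $x_{N,t}$ the largest eigenvalue of $A_t$, we get in
particular from (3.21) that

$$\mathbb{P} \big( \exists \, t \in [t_0, \tau] : x_{N,t} \notin [2 + \sigma, N^C] \big) \leq e^{-\nu (\log N)^{\xi}} \tag{4.3}$$

for some $\sigma > 0$ and $C > 0$.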

From now on we shall never use the symbols $f_t$ and $q_t$ in their above sense. The only information we
shall need about the largest eigenvalue $x_{N,t}$ is (4.3). In this section we shall not use any information about $q_t$, and in Section 5 we
shall only need that $q_t \geq c N^{\phi}$ uniformly in $t$. Throughout this section $f_t$ will denote the joint eigenvalue
density evolved under the Dyson Brownian motion. (See Definition 4.1 below.)

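It is well known that the eigenvalues $\mathbf{x}_t$ of $A_t$ satisfy the stochastic differential equation (Dyson Brownian
motion)

$$dx_i = \frac{dB_i}{\sqrt{N}} + \Big( -\frac{1}{4} x_i + \frac{1}{2N} \sum_{j \neq i} \frac{1}{x_i - x_j} \Big) dt \qquad \text{for } i = 1, \dots, N, \tag{4.4}$$

where $B_1, \dots, B_N$ is a family of independent standard Brownian motions.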
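For orientation, the drift and noise terms of (4.4) can be written out as a simple explicit time step. This is only a sketch of the dynamics, not a method used in the paper: the function names are hypothetical, and an explicit Euler-Maruyama step is a naive discretization that needs very small time steps when two eigenvalues are close, since the repulsion term is singular.

```python
import math
import random


def dbm_drift(x):
    """Drift of (4.4): -x_i/4 + (1/(2N)) * sum over j != i of 1/(x_i - x_j)."""
    N = len(x)
    return [
        -xi / 4.0
        + sum(1.0 / (xi - xj) for j, xj in enumerate(x) if j != i) / (2.0 * N)
        for i, xi in enumerate(x)
    ]


def euler_maruyama_step(x, dt, rng=random):
    """One explicit step of (4.4); the noise dB_i / sqrt(N) has variance dt / N."""
    N = len(x)
    drift = dbm_drift(x)
    return [
        xi + b * dt + rng.gauss(0.0, 1.0) * math.sqrt(dt / N)
        for xi, b in zip(x, drift)
    ]


drift_two_particles = dbm_drift([-1.0, 1.0])
unchanged = euler_maruyama_step([-1.0, 1.0], 0.0)
```

For two particles at $\pm 1$ the confining term pulls them towards the origin with strength $1/4$ while the repulsion pushes them apart with strength $1/8$, so the net drift points inwards.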
In order to describe the law of $V$, we define the equilibrium Hamiltonian

$$\mathcal{H}(\mathbf{x}) := \sum_{i} \frac{1}{4} x_i^2 - \frac{1}{N} \sum_{i < j} \log |x_i - x_j| \tag{4.5}$$

and denote the associated probability measure by

$$\mu^{(N)}(d\mathbf{x}) \equiv \mu(d\mathbf{x}) := \frac{1}{Z} e^{-N \mathcal{H}(\mathbf{x})} \, d\mathbf{x}, \tag{4.6}$$

where $Z$ is a normalization. We shall always consider the restriction of $\mu$ to the domain

$$\Sigma_N := \{ \mathbf{x} : x_1 < \cdots < x_N \},$$

i.e. a factor $\mathbf{1}(\mathbf{x} \in \Sigma_N)$ is understood in expressions like the right-hand side of (4.6); we shall usually omit
it. The law of the ordered eigenvalues of the GOE matrix $V$ is $\mu$.

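Define the Dirichlet form $D_\mu$ and the associated generator $L$ through

$$D_\mu(f) = -\int f (Lf) \, d\mu, \qquad D_\mu(f) := \frac{1}{2N} \int |\nabla f|^2 \, d\mu, \tag{4.7}$$

where $f$ is a smooth function of compact support on $\Sigma_N$. One may easily check that

$$L = \sum_{i} \frac{1}{2N} \partial_i^2 + \sum_{i} \Big( -\frac{1}{4} x_i + \frac{1}{2N} \sum_{j \neq i} \frac{1}{x_i - x_j} \Big) \partial_i,$$

and that $L$ is the generator of the Dyson Brownian motion (4.4). More precisely, the law of $\mathbf{x}_t$ is given by
$f_t(\mathbf{x}) \, \mu(d\mathbf{x})$, where $f_t$ solves $\partial_t f_t = L f_t$ and $f_0(\mathbf{x}) \, \mu(d\mathbf{x})$ is the law of $\mathbf{x}_0$.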
Definition 4.1. Let $f_t$ denote the solution of $\partial_t f_t = L f_t$ satisfying $f_t|_{t=0} = f_0$. It is well known that
this solution exists and is unique, and that $\Sigma_N$ is invariant under the Dyson Brownian motion, i.e. if $f_0$ is
supported in $\Sigma_N$, so is $f_t$ for all $t \geq 0$. For a precise formulation of these statements and their proofs, see
e.g. Appendices A and B in In AvvendixV^ we present a new, simpler and more general, proof. 

Theorem 4.2. Fix $n \geq 1$ and let $\mathbf{m} = (m_1, \dots, m_n) \in \mathbb{N}^n$ be an increasing family of indices. Let $G : \mathbb{R}^n \to \mathbb{R}$
be a continuous function of compact support and set

$$\mathcal{G}_{i,\mathbf{m}}(\mathbf{x}) := G \big( N(x_i - x_{i + m_1}), N(x_{i + m_1} - x_{i + m_2}), \dots, N(x_{i + m_{n-1}} - x_{i + m_n}) \big).$$

Let $\gamma_1, \dots, \gamma_{N-1}$ denote the classical locations of the first $N - 1$ eigenvalues, as defined in (3.15), and set

$$Q := \sup_{t \in [t_0, \tau]} \sum_{i = 1}^{N-1} \int (x_i - \gamma_i)^2 f_t \, d\mu. \tag{4.8}$$

Choose an $\varepsilon > 0$. Then for any $\rho$ satisfying $0 < \rho < 1$ there exists a $\bar\tau \in [\tau/2, \tau]$ such that, for any
$J \subset \{1, 2, \dots, N - m_n - 1\}$, we have



for all $N \geq N_0(\rho)$. Here $\mu^{(N-1)}$ is the equilibrium measure of $N - 1$ eigenvalues (GOE).



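Note that, by definition, the observables $\mathcal{G}_{i,\mathbf{m}}$ in (4.9) only depend on the eigenvalues $x_1, \dots, x_{N-1}$.
The rest of this section is devoted to the proof of Theorem 4.2. We begin by introducing a pseudo
equilibrium measure. Abbreviate

$$R := \sqrt{\tau N^{-\varepsilon}} = N^{-\rho/2 - \varepsilon/2}$$

and define

$$W(\mathbf{x}) := \sum_{i = 1}^{N} \frac{1}{2R^2} (x_i - \gamma_i)^2.$$

Here we set $\gamma_N := 2 + \sigma$ for convenience, but one may easily check that the proof remains valid for any larger
choice of $\gamma_N$. Define the probability measure

$$\omega(d\mathbf{x}) := \psi(\mathbf{x}) \, \mu(d\mathbf{x}) \qquad \text{where} \qquad \psi(\mathbf{x}) := \frac{Z}{\tilde Z} e^{-N W(\mathbf{x})}.$$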

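Next, we consider marginal quantities obtained by integrating out the largest eigenvalue $x_N$. To that
end we write

$$\mathbf{x} = (\hat{\mathbf{x}}, x_N), \qquad \hat{\mathbf{x}} = (x_1, \dots, x_{N-1})$$

and denote by $\hat\omega(d\hat{\mathbf{x}})$ the marginal measure of $\omega$ obtained by integrating out $x_N$. By a slight abuse of
notation, we sometimes make use of functions $\mu$, $\omega$, and $\hat\omega$, defined as the densities (with respect to Lebesgue
measure) of their respective measures. Thus,

$$\mu(\mathbf{x}) = \frac{1}{Z} e^{-N \mathcal{H}(\mathbf{x})}, \qquad \omega(\mathbf{x}) = \frac{1}{\tilde Z} e^{-N \mathcal{H}(\mathbf{x}) - N W(\mathbf{x})}, \qquad \hat\omega(\hat{\mathbf{x}}) = \int_{x_{N-1}}^{\infty} \omega(\hat{\mathbf{x}}, x_N) \, dx_N.$$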
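For any function $h(\mathbf{x})$ we introduce the conditional expectation

$$\langle h \rangle (\hat{\mathbf{x}}) := \mathbb{E}^{\omega} [h \,|\, \hat{\mathbf{x}}] = \frac{\int_{x_{N-1}}^{\infty} h(\hat{\mathbf{x}}, x_N) \, \omega(\hat{\mathbf{x}}, x_N) \, dx_N}{\hat\omega(\hat{\mathbf{x}})}.$$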

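Throughout the following, we write $g_t := f_t / \psi$. In order to avoid pathological behaviour of the extreme
eigenvalues, we introduce cutoffs. Let $\sigma$ be the spectral gap from (4.3), and choose $\theta_1, \theta_2, \theta_3 \in [0, 1]$ to be
smooth functions that satisfy

$$\theta_1(x_1) = \begin{cases} 0 & \text{if } x_1 \leq -4 \\ 1 & \text{if } x_1 \geq -3, \end{cases} \qquad \theta_2(x_{N-1}) = \begin{cases} 1 & \text{if } x_{N-1} \leq 2 + \frac{\sigma}{5} \\ 0 & \text{if } x_{N-1} \geq 2 + \frac{2\sigma}{5}, \end{cases} \qquad \theta_3(x_N) = \begin{cases} 0 & \text{if } x_N \leq 2 + \frac{3\sigma}{5} \\ 1 & \text{if } x_N \geq 2 + \frac{4\sigma}{5}. \end{cases}$$

Define $\theta \equiv \theta(x_1, x_{N-1}, x_N) = \theta_1(x_1) \, \theta_2(x_{N-1}) \, \theta_3(x_N)$. One easily finds that

$$\frac{|\nabla \theta|^2}{\theta} \leq C \, \mathbf{1}(-4 \leq x_1 \leq -3) + C \, \mathbf{1} \Big( \frac{\sigma}{5} \leq x_{N-1} - 2 \leq \frac{2\sigma}{5} \Big) + C \, \mathbf{1} \Big( \frac{3\sigma}{5} \leq x_N - 2 \leq \frac{4\sigma}{5} \Big), \tag{4.10}$$

where the left-hand side is understood to vanish outside the support of $\theta$.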
Define the density 



Zt — / Ogtduj 



ht := i 9gt , 
Zt 

If 1/ is a probability measure and q a density such that qi> is also a probability measure, we define the 
entropy 



q log q dv . 



The following result is our main tool for controlling the local ergodicity of the marginal Dyson Brownian 
motion. 



Proposition 4.3. Suppose that 



(a) sup 

t6[to,-r] 



l(x-i -3) + 1 XN-i ^ 2 + - 



4ct 

1( .Tw s: 2 + — 



(lii) sup sup {ei92){x)\\og{egt){x)\ ^ A^^ 



Then for i e [Iq, t] we have 



(4.11) 
(4.12) 

(4.13) 
(4.14) 
Proof. First we note that

$$\hat Z_t = \int \theta f_t \, d\mu = 1 - O \big( e^{-\nu (\log N)^{\xi}} \big) \tag{4.15}$$

uniformly for $t \in [t_0, \tau]$, by (4.12). Dropping the time index to avoid cluttering the notation, we find

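$$\partial_t S_{\hat\omega}(\langle h \rangle) = \partial_t \int \frac{\langle \theta g \rangle}{\hat Z} \log \langle \theta g \rangle \, d\hat\omega - \partial_t \log \hat Z = \frac{1}{\hat Z} \, \partial_t \int \langle \theta g \rangle \log \langle \theta g \rangle \, d\hat\omega - \big( 1 + \log \hat Z + S_{\hat\omega}(\langle h \rangle) \big) \, \partial_t \log \hat Z.$$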

We find that 

1 /2 

dtz = J e{Lf)dfi = j v^?•v/d/i ^ (^1 J\ve\'fdp^ D^{^fY 



/2 



Bounding the Dirichlet form in terms of the entropy (see e.g. [TU], Theorem 3.2), we find that 

^m(v<A) ^ \s^{ft„) ^ N^, (4.16) 
by (4.11). Using (4.10) we therefore find

dtZ ^ ^c^-c(iogAf)« ^ (4_;L7) 

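Thus we have

$$\partial_t S_{\hat\omega}(\langle h \rangle) \leq 2 \, \partial_t \int \langle \theta g \rangle \log \langle \theta g \rangle \, d\hat\omega + \big( 1 + S_{\hat\omega}(\langle h \rangle) \big) N^C e^{-c (\log N)^{\xi}}. \tag{4.18}$$

We therefore need to estimate

$$\partial_t \int \langle \theta g \rangle \log \langle \theta g \rangle \, d\hat\omega = \int \theta (Lf) \log \langle \theta g \rangle \, d\mu + \int \partial_t \langle \theta g \rangle \, d\hat\omega. \tag{4.19}$$

The second term of (4.19) is given by

$$\int \partial_t \langle \theta g \rangle \, d\hat\omega = \int \theta (Lf) \, d\mu = \partial_t \hat Z.$$

Therefore (4.18) yields

$$\partial_t S_{\hat\omega}(\langle h \rangle) \leq 2 \int \theta (Lf) \log \langle \theta g \rangle \, d\mu + \big( 1 + S_{\hat\omega}(\langle h \rangle) \big) N^C e^{-c (\log N)^{\xi}}. \tag{4.20}$$

The first term of (4.20) is given by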

~h j W-V((?log(0g))dM = V(0/).V(log(0.g))dA. + £i+£2, (4.21) 



where we defined 



fi V0-V(log(05))/d/i, £2 y V0-V/log(05)dM. 
Next, we estimate the error terms $\mathcal{E}_1$ and $\mathcal{E}_2$. Using (4.10) we get

I ^ ^'|2 \ 1/2 / r lY7/fl„\|2 \ 1/2 

|2 \ 1/2 



^ g-i/(logAr)« 

where we used (4.15). Similarly, we find



£2 ^ (^l\Ve\'\\og{9g)\'fdf^y(^P^d^?'^' 



Using (4.10), (4.13), and (4.16) we therefore get

■12 „ \ 1/2 



c 



Having dealt with the error terms $\mathcal{E}_1$ and $\mathcal{E}_2$, we compute the first term on the right-hand side of (4.21),

/ V(0/)-V(log(0g))d^ = Vs(0.9)-V£(log(0g))VdM-^ 1 Vs(log V^) • V2(log(0g)) 0.g^ d/. . 

(4.22) 

The second term of (4.22) is bounded by
N 



||Vslog^|2/d/i+^ j\l^^{eg)duj ^ rj-^N J ^Y.ix,~J,rfd^^ + Ar,D^{^) 



where $\eta > 0$.

The first term of (4.22) is equal to



{Wsieg))-Ws{log{0g))dQ. 

A simple calculation shows that 

(V£(0.g)) = V£(0.g)-(%V£logc^) + (0g)(V£logcj), 
so that the first term of (4.22) becomes

^ ^ V^(^^.g) • V2(log(0.g)) d2 + 1 / ((^gV^logc^) - (Bg) (Vjlogc^)) • V£(log(0g)) dS5 



. ,n ^n^/TM^^l /■ K^ffVelogg.) - {Og) (Vsloga;)|' 
Using the Cauchy-Schwarz inequality $\langle ab \rangle^2 \le \langle a^2 \rangle \langle b^2 \rangle$ we find that the second term is bounded by



1 



(0g(V£logw- (Vjlogcj))) 



Nri 



jV^logo; - (V^logw)!^^ dcj 



j\V^\oguj~ (V£logw)|^6'/dyU 



N-l 



TYry J ^ ' \ T.r — T 



Thus, we have to estimate 



fa := 



N-l 



xn ~ Xi \ X]\[ — Xi 



N-1 



N-q 



Since $x_N - x_i \ge \sigma/5$ on the support of $\theta f \mu$, one easily gets from (3.19) that

C 



Xn - Xi 



£3 ^ - 



In order to estimate $\mathcal{E}_4$, we write



Xn 



J dxN {xn - X^)Wi{xN )\ ^ 
J dXN Wi{XN) 



where 



N 2 N t \2 -1 — r 

w,{xn) ■■= ^xjv-i)e--""-i75^(""-^") [] (xn - Xj) . 

We now claim that on the support of $\theta$, in particular for $-4 \le x_1 < x_{N-1} \le 2 + 2\sigma/5$, we have

J dxN {xn - Xi)Wi{xN) ^ 
J dXNUJi{XN) 

uniformly for x G Y.n-i- Indeed, writing 'jn '■= 7Ar(l + R^^), we have on the support of 6 

J dXN {xn - Xi)Wi{xN) ^ ~ , J dXN {xn -lN)Wi{xN) 

^ 7Af/2 H 



/ da; AT Wi{xN) 
Moreover, the second term is nonnegative: 

dxN {xn -lN)wi{xN) = -Cn{x) I dxN 



J dXN w^{xn) 



\dxN 



(xjv-7jv)" 



n ~ 



Cw(S)e-i^^(""-^-^")' Y[{xN-i-x,) 



+ Cn{x) 



> 0. 



(4.23) 
where $C_N(x)$ is nonnegative. This proves (4.23). Using (4.23) we get



C 



C 



Summarizing, we have proved that 



dtSc^m) ^ -(4-8r;-e-^('°s^)')i?s(yW) + (l + 5s(W))c-^(i°s^)' + ^ + ^. 

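Choosing $\eta$ small enough completes the proof. □

Next, we derive a logarithmic convexity bound for the marginal measure $\omega$.

Lemma 4.4. We have that

$$\omega(\hat x) = \frac{1}{\widetilde Z}\, e^{-N \widehat{\mathcal H}(\hat x)},$$

where

$$\widehat{\mathcal H}(\hat x) = -\frac{1}{N} \sum_{i<j<N} \log|x_i - x_j| + V(\hat x),$$

and $\nabla^2 V(\hat x) \ge R^{-2}$.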



Proof. Write $\mathcal H(\hat x, x_N) = \mathcal H'(\hat x) + \mathcal H''(\hat x, x_N)$ where

n'ix) := \og\xi-Xj\, H"{x,xn) -^Y^°s\xn 



N 

i<j<N 

By definition, we have 



uj{x) 



Z 



-NH"[x,xn) 



dxN 



(4.24) 



The main tool in our proof is the Brascamp-Lieb inequality [5]. In order to apply it, we need to extend the
integration over $x_N$ to $\mathbb R$ and replace the singular logarithm with a $C^2$-function. To that end, we introduce
the approximation parameter $\delta > 0$ and define, for $\hat x \in \Sigma_{N-1}$,



Vsix) := -^log j exp 



Li<N 



2i?2 



da; 



N ■ 



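where we defined

$$\log_\delta(x) := \mathbf 1(x \ge \delta) \log x + \mathbf 1(x < \delta) \Big( \log \delta + \frac{x - \delta}{\delta} - \frac{(x - \delta)^2}{2\delta^2} \Big).$$

It is easy to check that $\log_\delta \in C^2(\mathbb R)$, is concave, and satisfies

$$\lim_{\delta \to 0} \log_\delta(x) = \begin{cases} \log x & \text{if } x > 0, \\ -\infty & \text{if } x \le 0. \end{cases}$$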
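As an informal numerical illustration (not part of the original argument), the following Python sketch implements the regularized logarithm $\log_\delta$ from the formula above and probes the three properties just stated: matching with $\log$ at $x = \delta$, concavity (through a second difference), and the limiting behaviour as $\delta \to 0$. The function names `log_delta` and `second_difference`, and the sample values, are our own choices.

```python
import math


def log_delta(x: float, delta: float) -> float:
    """Regularized logarithm: log x for x >= delta, and below delta the
    second-order Taylor polynomial of log around delta."""
    if x >= delta:
        return math.log(x)
    u = x - delta
    return math.log(delta) + u / delta - u * u / (2.0 * delta * delta)


def second_difference(f, x: float, h: float) -> float:
    """Symmetric second difference (f(x+h) + f(x-h) - 2 f(x)) / h^2."""
    return (f(x + h) + f(x - h) - 2.0 * f(x)) / (h * h)


if __name__ == "__main__":
    delta = 0.1
    f = lambda x: log_delta(x, delta)
    # Concavity: the second difference should be negative everywhere,
    # including across the gluing point x = delta.
    for x in (-1.0, 0.0, 0.05, 0.1, 0.5, 2.0):
        print(f"x = {x:5.2f}  log_delta = {f(x):10.4f}  "
              f"second difference = {second_difference(f, x, 1e-3):10.4f}")
    # Limit delta -> 0: log x is recovered for x > 0, divergence to -infinity for x <= 0.
    for d in (1e-1, 1e-2, 1e-3):
        print(f"delta = {d:g}: value at 0.5 = {log_delta(0.5, d):.6f}, "
              f"value at -0.5 = {log_delta(-0.5, d):.3e}")
```

Below $\delta$ the function is an exact quadratic with second derivative $-1/\delta^2$, which is the value of $(\log x)''$ at $x = \delta$; this is what makes the gluing $C^2$.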
Thus we find that $V_\delta \in C^2(\Sigma_{N-1})$ and that we have the pointwise convergence, for all $\hat x \in \Sigma_{N-1}$,

$$\lim_{\delta \to 0} V_\delta(\hat x) = V(\hat x) = -\frac{1}{N} \log \int e^{-N \mathcal H''(\hat x, x_N)}\, dx_N,$$

where $V \in C^2(\Sigma_{N-1})$ satisfies (4.24).

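Next, we claim that if $\varphi = \varphi(x, y)$ satisfies $\nabla^2 \varphi(x, y) \ge K$ then $\psi$, defined by

$$e^{-\psi(x)} := \int e^{-\varphi(x, y)}\, dy,$$

satisfies $\nabla^2 \psi(x) \ge K$. In order to prove the claim, we use subscripts to denote partial derivatives and recall
the Brascamp-Lieb inequality for log-concave functions (Equation 4.7 in [5])

$$\psi_{xx} \ge \frac{\int \big( \varphi_{xx} - \varphi_{xy} \varphi_{yy}^{-1} \varphi_{yx} \big) e^{-\varphi}\, dy}{\int e^{-\varphi}\, dy}.$$

Then the claim follows from

$$\begin{pmatrix} \varphi_{xx} & \varphi_{xy} \\ \varphi_{yx} & \varphi_{yy} \end{pmatrix} \ge K \quad \Longrightarrow \quad \varphi_{xx} - \varphi_{xy} \varphi_{yy}^{-1} \varphi_{yx} \ge K.$$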
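The claim can be checked by hand in the simplest case of a positive definite quadratic form, where the marginal is again Gaussian and its curvature is exactly the Schur complement. The Python sketch below is only an illustration under that assumption: it takes $\varphi(x, y) = \tfrac12 (a x^2 + 2 b x y + c y^2)$ with scalar $x, y$, computes $\psi$ by a trapezoid rule, and compares a second difference of $\psi$ with $a - b^2/c$ and with the smallest eigenvalue of the Hessian. All function names and parameter values are hypothetical.

```python
import math


def psi(x, a, b, c, half_width=12.0, n=4001):
    """-log of the y-marginal of exp(-phi), with
    phi(x, y) = (a x^2 + 2 b x y + c y^2) / 2, using the trapezoid rule in y."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for k in range(n):
        y = -half_width + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(-0.5 * (a * x * x + 2.0 * b * x * y + c * y * y))
    return -math.log(total * h)


def marginal_curvature(a, b, c, x=0.3, h=1e-2):
    """Second difference of psi at x, approximating psi''(x)."""
    return (psi(x + h, a, b, c) + psi(x - h, a, b, c) - 2.0 * psi(x, a, b, c)) / (h * h)


def smallest_eigenvalue(a, b, c):
    """Smallest eigenvalue of the Hessian [[a, b], [b, c]]; plays the role of K."""
    return 0.5 * ((a + c) - math.sqrt((a - c) ** 2 + 4.0 * b * b))


if __name__ == "__main__":
    a, b, c = 2.0, 0.8, 1.5  # a positive definite example
    print("numerical psi''  :", marginal_curvature(a, b, c))
    print("Schur complement :", a - b * b / c)
    print("K = lambda_min   :", smallest_eigenvalue(a, b, c))
```

In this Gaussian case the Brascamp-Lieb inequality holds with equality, so the first two printed numbers should agree up to discretization error, and both should dominate the third.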

Using this claim, we find that $\nabla^2 V_\delta(\hat x) \ge R^{-2}$ for all $\hat x \in \Sigma_{N-1}$. In order to prove that $\nabla^2 V(\hat x) \ge R^{-2}$,
and hence complete the proof, it suffices to consider directional derivatives and prove the following claim. If
$(\zeta_\delta)_{\delta > 0}$ is a family of functions on a neighbourhood $U$ that converges pointwise to a $C^2$-function $\zeta$ as $\delta \to 0$,
and if $\zeta_\delta''(x) \ge K$ for all $\delta > 0$ and $x \in U$, then $\zeta''(x) \ge K$ for all $x \in U$. Indeed, taking $\delta \to 0$ in

$$\zeta_\delta(x + h) + \zeta_\delta(x - h) - 2\zeta_\delta(x) = \int_0^h \big( \zeta_\delta''(x + \xi) + \zeta_\delta''(x - \xi) \big)(h - \xi)\, d\xi \ge K h^2$$

yields $\big(\zeta(x + h) + \zeta(x - h) - 2\zeta(x)\big) h^{-2} \ge K$, from which the claim follows by taking the limit $h \to 0$. □

As a first consequence of Lemma 4.4 we derive an estimate on the expectation of observables depending
only on eigenvalue differences.

Proposition 4.5. Let $q \in L^\infty(d\omega)$ be a probability density. Then for any $J \subset \{1, 2, \dots, N - m_n - 1\}$ and any
$t > 0$ we have



Proof. Using Lemma 4.4, the proof of Theorem 4.3 in [16] applies with merely cosmetic changes. □

Another, standard, consequence of Lemma 4.4 is the logarithmic Sobolev inequality

$$S_\omega(q) \le C R^2 D_\omega(\sqrt q). \qquad (4.25)$$

Using (4.25) and Proposition 4.3 we get the following estimate on the Dirichlet form.
Proposition 4.6. Under the assumptions of Proposition 4.3, there exists a $\bar\tau \in [\tau/2, \tau]$ such that

$$S_\omega(h_{\bar\tau}) \le C N R^{-2} Q + C R^2, \qquad D_\omega(\sqrt{h_{\bar\tau}}) \le C N R^{-4} Q + C.$$

Proof. Combining (4.25) with (4.14) yields

$$\partial_t S_\omega(h_t) \le -C R^{-2} S_\omega(h_t) + C N Q R^{-4} + C, \qquad (4.26)$$

which we integrate from $t_0$ to $t$ to get

$$S_\omega(h_t) \le e^{-C R^{-2}(t - t_0)} S_\omega(h_{t_0}) + C N Q R^{-2} + C R^2. \qquad (4.27)$$

Moreover, (4.15) yields

= CSM-C f logV'/todM + c-''('°s^)', 



where the second inequality follows from the fact that taking marginals reduces the relative entropy; see the
proof of Lemma 4.7 below for more details. Thus we get



Sa{{ht„)) s=: N^ + NR-^Q s$ . 

Thus (4.27) yields

$$S_\omega(h_t) \le N^C e^{-C R^{-2}(t - t_0)} + C N R^{-2} Q + C R^2$$

for $t \in [t_0, \tau]$. Integrating (4.14) from $\tau/2$ to $\tau$ therefore gives

$$\frac{2}{\tau} \int_{\tau/2}^{\tau} D_\omega(\sqrt{h_t})\, dt \le C N R^{-4} Q + C,$$

and the claim follows.

We may finally complete the proof of Theorem 4.2.



(4.28) 



□ 



Proof of Theorem 4.2. The assumptions of Proposition 4.3 are verified in Subsection 4.1 below. Hence
Propositions 4.5 and 4.6 yield



Using and (PJ^ we get 



(4.29) 
In order to compare the measures $\omega$ and $\mu^{(N-1)}$, we define the density

:= ^ expi + Y^{X^~ -^^f - ^ \og\xN " X,\ \ , 

\i<N i<N i<N ) 

where $Z'$ is a normalization chosen so that $\theta q \, d\omega$ is a probability measure. It is easy to see that

qdoj ^ d^(^-^' ®dg, 

N 2 N I \2 

where — Ce~~^^~'^^^^~'^^^ dxN is a Gaussian measure. Similarly to Proposition [?3J we have 



\J\ 



Thus we have to estimate 



C ^ ffl n N"^ , „ 1 \ , , 1 

C + NR-' I ^(x.-7,rd/.(^-^) 

^ A ^ AT 



where the second inequality follows from standard large deviation results for GOE. Since $\int \sum_{i<N} (x_i - \gamma_i)^2 \, d\mu^{(N-1)} \le C N^{-1+\varepsilon'}$
for arbitrary $\varepsilon'$ is known to hold for GOE (see [12] where this is proved for more
general Wigner matrices), we find



1 



I ^1 ^Qi.-^Oqduj 



1 



^ &i,m do; 



leJ 



\J\ 



The cutoff $\theta$ can be easily removed using the standard properties of $d\mu^{(N-1)}$. Choosing $\varepsilon' = \varepsilon$, replacing $\varepsilon$
with $\varepsilon/2$, and recalling (4.29) completes the proof. □

4.1. Verifying the assumptions of Proposition 4.3. The estimate (4.11) is an immediate consequence of the
following lemma.

Lemma 4.7. Let the entries of $A_0$ have the distribution $\zeta_0$. Then for any $t > 0$ we have

$$S_\mu(f_t) \le N^2 \big( N m_2(\zeta_0) - \log(1 - e^{-t}) \big),$$

where $m_2(\zeta_0)$ is the second moment of $\zeta_0$.

Proof. Recall that the relative entropy is defined, for $\nu \ll \mu$, as $S(\nu | \mu) := \int \log \frac{d\nu}{d\mu} \, d\nu$. If $\hat\nu$ and $\hat\mu$ are
marginals of $\nu$ and $\mu$ with respect to the same variable, it is easy to check that $S(\hat\nu | \hat\mu) \le S(\nu | \mu)$. Therefore

$$S_\mu(f_t) = S(f_t \mu | \mu) \le S(A_t | V) = N^2 S(\zeta_t | g_{2/N}),$$

where $\zeta_t$ denotes the law of the off-diagonal entries of $A_t$, and $g_\lambda$ is a centred Gaussian with variance $\lambda$
(the diagonal entries are dealt with similarly). Setting $\gamma = 1 - e^{-t}$, we find from (4.2) that $\zeta_t$ has probability
density $\varrho_\gamma * g_{2\gamma/N}$, where $\varrho_\gamma$ is the probability density of $(1 - \gamma)^{1/2} \zeta_0$. Therefore Jensen's inequality yields

$$S(\zeta_t | g_{2/N}) = S\Big( \int dy \, \varrho_\gamma(y)\, g_{2\gamma/N}(\cdot - y) \,\Big|\, g_{2/N} \Big) \le \int dy \, \varrho_\gamma(y)\, S\big( g_{2\gamma/N}(\cdot - y) \,\big|\, g_{2/N} \big).$$

By explicit computation one finds

$$S\big( g_{2\gamma/N}(\cdot - y) \,\big|\, g_{2/N} \big) = \frac12 \Big( \frac{N}{2} y^2 - \log \gamma + \gamma - 1 \Big).$$

Therefore

$$S(\zeta_t | g_{2/N}) \le N m_2(\zeta_0) - \log \gamma,$$

and the claim follows. □
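The explicit Gaussian computation used in this proof is easy to confirm numerically. The sketch below is an illustration only, with hypothetical function names and parameter values: it evaluates the relative entropy of a Gaussian with mean $y$ and variance $2\gamma/N$ with respect to a centred Gaussian with variance $2/N$ by a trapezoid rule, and compares it with the closed form $\tfrac12 (N y^2/2 - \log\gamma + \gamma - 1)$.

```python
import math


def gaussian_relative_entropy_numeric(y, gamma, N, n=20001):
    """Trapezoid-rule value of S(p | q) = int p log(p/q), where
    p = N(y, 2*gamma/N) and q = N(0, 2/N)."""
    var_p = 2.0 * gamma / N
    var_q = 2.0 / N
    sd = math.sqrt(var_p)
    lo, hi = y - 12.0 * sd, y + 12.0 * sd  # p is negligible outside this window
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        log_p = -0.5 * (x - y) ** 2 / var_p - 0.5 * math.log(2.0 * math.pi * var_p)
        log_q = -0.5 * x * x / var_q - 0.5 * math.log(2.0 * math.pi * var_q)
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(log_p) * (log_p - log_q)
    return total * h


def gaussian_relative_entropy_closed_form(y, gamma, N):
    """The closed form (1/2) (N y^2 / 2 - log gamma + gamma - 1)."""
    return 0.5 * (N * y * y / 2.0 - math.log(gamma) + gamma - 1.0)


if __name__ == "__main__":
    y, gamma, N = 0.2, 0.3, 50
    print("numeric     :", gaussian_relative_entropy_numeric(y, gamma, N))
    print("closed form :", gaussian_relative_entropy_closed_form(y, gamma, N))
```

Averaging the closed form over $y$ distributed as $(1-\gamma)^{1/2}\zeta_0$ replaces $y^2$ by $(1-\gamma) m_2(\zeta_0)$, which is how the bound $N m_2(\zeta_0) - \log\gamma$ arises (using $\gamma \le 1$).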

The estimate (4.12) follows from (4.3) and (3.19). It only remains to verify (4.13).

Lemma 4.8. For any $t \in [t_0, \tau]$ we have

$$(\theta_1 \theta_2)(\hat x)\, \big| \log(\theta g_t)(\hat x) \big|^2 \le N^C. \qquad (4.30)$$



Proof. Let $\zeta_t$ be the law of an off-diagonal entry $a$ of $A_t$ (the diagonal entries are treated similarly). From
(4.2) we find

$$\zeta_t = \varrho_\gamma * g_{2\gamma/N},$$

where $\gamma = 1 - e^{-t}$, $\varrho_\gamma$ is the law of $(1 - \gamma)^{1/2} \zeta_0$, and $g_\lambda$ is a centred Gaussian with variance $\lambda$. Using $da$
to denote Lebesgue measure, we find by explicit calculation that

e <; , 5^ c , 

da 

which gives 

<ig2j/N 

Therefore, the density $F_t(A)$ of the law of $A$ with respect to the GOE measure satisfies

$$e^{-N^C - N^C \operatorname{Tr} A^2} \le F_t(A) \le e^{N^C}.$$

Parametrizing $A = A(x, v)$ using the eigenvalues $x$ and eigenvectors $v$, the GOE measure can be written in
the factorized form $\mu(dx) P(dv)$, where $\mu$ is defined in (4.6) and $P$ is a probability measure. Thus we get
that the density

$$f_t(x) = \int F_t(x, v)\, P(dv)$$
satisfies

Next, it is easy to see that

Using (4.32) we may now derive an upper bound on $(\theta_3 g_t)$:

J dxN 03{xN)ftix,XN)fJ-{x,XN) 



{039t){x) = 



/ dxN tp{x,XN)tl{x,XN) 



^2 / dxN fJ-{x,XN) 



JdxNG ^'^^i' ll{x, Xn) 

Since 



JdxNe-^''-'-ti{x,x^) ^ L^_,dxMe ^ ^« n.<^(^^ - ^Qe" ^ ^ E,<„ 

JdxNK^^XN) J^^_^dxNU^<Ni^N-X,)e-^-N 



by a straightforward calculation, we get 

We now derive a lower bound on $(\theta_3 g_t)$. Using (4.32) and (4.31) we find



^ _,;^c J dxN 03{xN)ft{x,XN)n{x,XN) 
J dXN ^Ji{X,XN) 



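by a calculation similar to (4.33). The claim follows from

$$(\theta_1\theta_2)(\hat x)\, \big|\log(\theta g_t)(\hat x)\big|^2 \le 2 (\theta_1\theta_2)(\hat x)\, \big|\log (\theta_1\theta_2)(\hat x)\big|^2 + 2 (\theta_1\theta_2)(\hat x)\, \big|\log(\theta_3 g_t)(\hat x)\big|^2 \le 2 + N^C.$$

□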



5. Bulk universality: proof of Theorem 2.5

Similarly to (2.8), we define $p_N^{(N-1)}(x_1, \dots, x_{N-1})$ as the probability density obtained by symmetrizing (in
the variables $x_1, \dots, x_{N-1}$) the function $\int dx_N \, \mu(x)$, and set, for $n \le N - 1$,

$$p_N^{(n)}(x_1, \dots, x_n) := \int dx_{n+1} \cdots dx_{N-1} \; p_N^{(N-1)}(x_1, \dots, x_{N-1}).$$

We begin with a universality result for sparse matrices with a small Gaussian convolution.
Theorem 5.1. Let $E \in [-2 + \kappa, 2 - \kappa]$ for some $\kappa > 0$ and let $b \equiv b_N$ satisfy $|b| \le \kappa/2$. Pick $\varepsilon, \beta > 0$, and
set $\tau := N^{-2\alpha + \beta}$, where

$$\alpha \equiv \alpha(\phi) := \min\Big\{ 2\phi - \frac12, \; 4\phi - \frac23 \Big\}. \qquad (5.1)$$

Let $n \in \mathbb N$ and $O : \mathbb R^n \to \mathbb R$ be compactly supported and continuous. Then there is a $\bar\tau \in [\tau/2, \tau]$ such that



Je ^ ^ J , . . . , a„) ^^J^y, {P% - Pgoe.n) (^-E' 



NQsc{E) 



(5.2) 



Proof. The claim follows from Theorem 4.2 and Theorem 3.4, similarly to the proof of Theorem 2.1 in [16].
We use that $Q \le (\log N)^{C\xi} N^{-2\alpha}$, as follows from (3.16); the contribution of the low probability complement
event to (|3.16[) may be easily estimated using Cauchy-Schwarz and the estimate J^i ^^^t = Tr < , 
uniformly for $t \ge 0$. The assumption IV of [16] is a straightforward consequence of the local semicircle law,
Theorem 3.3. □



Proposition 5.2. Let A^^^ = (a|^'') and A^"^^ = (a^^f) be sparse random matrices, both satisfying Definition 



with 



.(2) > ^0 



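(in self-explanatory notation). Suppose that, for each $i, j$, the first three moments of $a_{ij}^{(1)}$ and $a_{ij}^{(2)}$ are the
same, and that the fourth moments satisfy

$$\big| \mathbb E (a_{ij}^{(1)})^4 - \mathbb E (a_{ij}^{(2)})^4 \big| \le N^{-2-\delta}, \qquad (5.3)$$

for some $\delta > 0$.

Let $n \in \mathbb N$ and let $F \in C^5(\mathbb C^n)$. We assume that, for any multi-index $\alpha \in \mathbb N^n$ with $1 \le |\alpha| \le 5$ and any
sufficiently small $\varepsilon' > 0$, we have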

max||a"F(a;i,...,a;„)| : iV^'l 7V^«^' , maxl\d°'F{xi,...,x„)\:J2\^^\^^( ^ ' 

where $C_0$ is a constant.

Let $\kappa > 0$ be arbitrary. Choose a sequence of positive integers $k_1, \dots, k_n$ and real parameters $E_j^m \in [-2 + \kappa, 2 - \kappa]$,
where $m = 1, \dots, n$ and $j = 1, \dots, k_m$. Let $\varepsilon > 0$ be arbitrary and choose $\eta$ with $N^{-1-\varepsilon} \le \eta \le N^{-1}$.
Set $z_j^m := E_j^m \pm i\eta$ with an arbitrary choice of the $\pm$ signs.

Then, abbreviating $G^{(l)}(z) := (A^{(l)} - z)^{-1}$, we have



IF 



Tr 



i=i 



Tr 
Proof. The proof of Theorem 2.3 in [17] may be reproduced almost verbatim; the rest term in the Green
function expansion is estimated by an $L^\infty$-$L^1$ bound using $\mathbb E |a_{ij}^{(l)}|^5 \le C N^{-1-3\phi}$. □

As in [17] (Theorem 6.4), Proposition 5.2 readily implies the following correlation function comparison
theorem.

Theorem 5.3. Suppose the assumptions of Proposition 5.2 hold. Let $p^{(n)}_{(1),N}$ and $p^{(n)}_{(2),N}$ be $n$-point correlation
functions of the eigenvalues of $A^{(1)}$ and $A^{(2)}$ respectively. Then for any $|E| < 2$, any $n \ge 1$ and any compactly
supported test function $O : \mathbb R^n \to \mathbb R$ we have



lirn^ J dai ■ ■ ■ da„ 0(ai , . . . , a„) (p['iIn ^ ^'(2),jv) 



We may now complete the proof of Theorem 2.5.

Proof of Theorem 2.5. In order to invoke Theorems 5.1 and 5.3, we construct a sparse matrix $A_0$,
satisfying Definition 2.2, such that its time evolution $A_{\bar\tau}$ is close to $A$ in the sense of the assumptions of
Proposition 5.2. For definiteness, we concentrate on off-diagonal elements (the diagonal elements are dealt
with similarly).

For the following we fix $i < j$; all constants in the following are uniform in $i$, $j$, and $N$. Let $\xi, \xi', \xi_0$ be
random variables equal in distribution to $a_{ij}$, $(a_{\bar\tau})_{ij}$, $(a_0)_{ij}$ respectively. For any random variable $X$ we use
the notation $\widetilde X := X - \mathbb E X$. Abbreviating $\gamma := 1 - e^{-\bar\tau}$, we have



where $g$ is a centred Gaussian with variance $1/N$, independent of $\xi_0$. We shall construct a random variable
$\xi_0$, supported on at most three points, such that $A_0$ satisfies Definition 2.2 and the first four moments of
$\xi'$ are sufficiently close to those of $\xi$. For $k = 1, 2, \dots$ we denote by $m_k(X)$ the $k$-th moment of a random
variable $X$. We set

$$\xi_0 = \frac{f}{\sqrt{1 - \gamma}\, N} + \widetilde\xi_0, \qquad (5.4)$$

where $m_1(\widetilde\xi_0) = 0$ and $m_2(\widetilde\xi_0) = N^{-1}$. It is easy to see that $m_k(\xi) = m_k(\xi')$ for $k = 1, 2$.
We take the law of $\widetilde\xi_0$ to be of the form

$$p \delta_a + q \delta_{-b} + (1 - p - q) \delta_0,$$

where $a, b, p, q \ge 0$ are parameters satisfying $p + q \le 1$. The conditions $m_1(\widetilde\xi_0) = 0$ and $m_2(\widetilde\xi_0) = N^{-1}$ imply

$$p = \frac{1}{a N (a + b)}, \qquad q = \frac{1}{b N (a + b)}.$$

Thus, we parametrize $\xi_0$ using $a$ and $b$; the condition $p + q \le 1$ reads $ab \ge N^{-1}$. Our aim is to determine $a$
and $b$ so that $\xi_0$ satisfies (2.4), and so that the third and fourth moments of $\xi'$ and $\xi$ are close. By explicit
computation we find

$$m_3(\widetilde\xi_0) = \frac{a - b}{N}, \qquad m_4(\widetilde\xi_0) = N m_3(\widetilde\xi_0)^2 + \frac{ab}{N}. \qquad (5.5)$$
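The moment identities for the three-point law can be verified in exact arithmetic. The following Python sketch is an illustration, not part of the original proof; the helper names `three_point_law` and `moment` and the sample values of $a$, $b$, $N$ are our own. It builds the law $p\delta_a + q\delta_{-b} + (1-p-q)\delta_0$ with $p = 1/(aN(a+b))$, $q = 1/(bN(a+b))$ and checks the first four moments against the formulas above.

```python
from fractions import Fraction


def three_point_law(a, b, N):
    """Atoms and weights of p*delta_a + q*delta_{-b} + (1-p-q)*delta_0,
    with p and q fixed by the conditions mean = 0 and variance = 1/N."""
    a, b = Fraction(a), Fraction(b)
    p = 1 / (a * N * (a + b))
    q = 1 / (b * N * (a + b))
    if p + q > 1:
        # p + q = 1 / (N a b), so this is the condition a*b >= 1/N
        raise ValueError("need a*b >= 1/N for a probability law")
    return [(a, p), (-b, q), (Fraction(0), 1 - p - q)]


def moment(law, k):
    """k-th moment of a finitely supported law given as (atom, weight) pairs."""
    return sum(weight * atom ** k for atom, weight in law)


if __name__ == "__main__":
    N = 100
    a, b = Fraction(1, 5), Fraction(1, 10)
    law = three_point_law(a, b, N)
    m1, m2, m3, m4 = (moment(law, k) for k in (1, 2, 3, 4))
    print("m1 =", m1, " m2 =", m2)
    print("m3 =", m3, " expected (a - b)/N         =", (a - b) / N)
    print("m4 =", m4, " expected N*m3^2 + a*b/N    =", N * m3 ** 2 + a * b / N)
```

Using `Fraction` avoids rounding, so the identities in (5.5) are checked exactly rather than up to floating-point error.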
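Now we require that $a$ and $b$ be chosen so that $ab \ge N^{-1}$ and

$$m_3(\widetilde\xi_0) = (1 - \gamma)^{-3/2} m_3(\widetilde\xi), \qquad m_4(\widetilde\xi_0) = N m_3(\widetilde\xi_0)^2 + m_4(\widetilde\xi) - N m_3(\widetilde\xi)^2.$$

Using (5.5), it is easy to see that such a pair $(a, b)$ exists provided that $m_4(\widetilde\xi) - N m_3(\widetilde\xi)^2 \ge N^{-2}$. This latter
estimate is generally valid for any random variable with $m_1 = 0$ and $m_2 = N^{-1}$; it follows from the elementary inequality
$m_4 m_2 - m_3^2 \ge m_2^3$ valid whenever $m_1 = 0$.

Next, using (5.5) and the estimates $m_3(\widetilde\xi) = O(N^{-1-\phi})$, $m_4(\widetilde\xi) = O(N^{-1-2\phi})$, we find

$$a - b = O(N^{-\phi}), \qquad ab = O(N^{-2\phi}),$$

which implies $a, b = O(N^{-\phi})$. We have hence proved that $\xi_0$ satisfies Definition 2.2.
One readily finds that $m_3(\widetilde\xi') = m_3(\widetilde\xi)$. Moreover, using

$$m_4(\widetilde\xi_0) - m_4(\widetilde\xi) = N m_3(\widetilde\xi)^2 \big[ (1 - \gamma)^{-3} - 1 \big] = O(N^{-1-2\phi} \gamma),$$

we find

$$m_4(\widetilde\xi') - m_4(\widetilde\xi) = (1 - \gamma)^2 m_4(\widetilde\xi_0) + \frac{6\gamma}{N^2} - \frac{3\gamma^2}{N^2} - m_4(\widetilde\xi) = O(N^{-1-2\phi} \gamma).$$

Summarizing, we have proved

$$m_k(\widetilde\xi') = m_k(\widetilde\xi) \quad (k = 1, 2, 3), \qquad |m_4(\widetilde\xi') - m_4(\widetilde\xi)| \le C N^{-1-2\phi} \gamma. \qquad (5.6)$$

The claim follows now by setting $\delta = 2\alpha(\phi) + 2\phi - 1 - \beta$ in (5.3), and invoking Theorems 5.1 and 5.3. □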



6. Edge universality: proof of Theorem 2.7

6.1. Rank-one perturbations of the GOE. We begin by deriving a simple, entirely deterministic, result on
the eigenvalues of rank-one perturbations of matrices. We choose the perturbation to be proportional to
$|e\rangle\langle e|$, but all results of this subsection hold trivially if $e$ is replaced with an arbitrary $\ell^2$-normalized vector.

Lemma 6.1 (Monotonicity and interlacing). Let $H$ be a symmetric $N \times N$ matrix. For $f \ge 0$ we set

$$A(f) := H + f |e\rangle\langle e|.$$

Denote by $\lambda_1 \le \cdots \le \lambda_N$ the eigenvalues of $H$, and by $\mu_1(f) \le \cdots \le \mu_N(f)$ the eigenvalues of $A(f)$. Then
for all $\alpha = 1, \dots, N - 1$ and $f \ge 0$ the function $\mu_\alpha(f)$ is nondecreasing, satisfies $\mu_\alpha(0) = \lambda_\alpha$, and has the
interlacing property

$$\lambda_\alpha \le \mu_\alpha(f) \le \lambda_{\alpha+1}. \qquad (6.1)$$

Proof. From [11], Equation (6.3), we find that $\mu$ is an eigenvalue of $H + f|e\rangle\langle e|$ if and only if

$$\sum_{\alpha} \frac{|\langle u_\alpha, e \rangle|^2}{\mu - \lambda_\alpha} = \frac{1}{f}, \qquad (6.2)$$

where $u_\alpha$ is the eigenvector of $H$ associated with the eigenvalue $\lambda_\alpha$. The left-hand side of (6.2) has $N$
singularities at $\lambda_1, \dots, \lambda_N$, away from which it is decreasing. All claims now follow easily. □
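To make the monotonicity and interlacing concrete, one can solve the secular equation $\sum_\alpha |\langle u_\alpha, e\rangle|^2/(\mu - \lambda_\alpha) = 1/f$ numerically. The Python sketch below is an illustration under simplifying assumptions of our own: the eigenvalues of $H$ are distinct and all weights $w_\alpha = |\langle u_\alpha, e\rangle|^2$ are strictly positive, so that there is exactly one root in each gap $(\lambda_\alpha, \lambda_{\alpha+1})$ and one above $\lambda_N$. The function names are hypothetical.

```python
def secular_function(mu, lam, weights):
    """Left-hand side of the secular equation: sum_a w_a / (mu - lam_a)."""
    return sum(w / (mu - l) for l, w in zip(lam, weights))


def perturbed_eigenvalues(lam, weights, f):
    """Roots of sum_a w_a / (mu - lam_a) = 1/f by bisection.

    lam must be strictly increasing and all weights positive.  On each
    interval (lam_a, lam_{a+1}) the secular function decreases from +inf
    to -inf, so it crosses 1/f exactly once; above lam_N it decreases
    from +inf to 0, and the root lies below lam_N + f * sum(weights).
    """
    assert f > 0 and all(w > 0 for w in weights)
    target = 1.0 / f
    n = len(lam)
    roots = []
    for a in range(n):
        lo = lam[a]
        hi = lam[a + 1] if a + 1 < n else lam[a] + f * sum(weights) + 1.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:  # interval collapsed to machine precision
                break
            if secular_function(mid, lam, weights) > target:
                lo = mid  # still left of the root
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


if __name__ == "__main__":
    lam = [-1.0, 0.0, 0.5, 2.0]
    weights = [0.1, 0.4, 0.3, 0.2]  # squared overlaps with e, summing to 1
    for f in (0.5, 2.0, 20.0):
        mu = perturbed_eigenvalues(lam, weights, f)
        print(f"f = {f:5.1f}  mu = {[round(m, 6) for m in mu]}")
```

As $f$ grows, each $\mu_\alpha$ with $\alpha < N$ moves up towards $\lambda_{\alpha+1}$ without crossing it, while the top root $\mu_N$ escapes to infinity. A useful consistency check is the trace identity $\sum_\alpha \mu_\alpha = \sum_\alpha \lambda_\alpha + f \|e\|^2$.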
Next, we establish the following "eigenvalue sticking" property for GOE. Let $\alpha$ label an eigenvalue close
to the right (say) spectral edge. Roughly we prove that, in the case where $H = V$ is a GOE matrix and
$f > 1$, the eigenvalue $\mu_\alpha$ of $V + f|e\rangle\langle e|$ "sticks" to $\lambda_{\alpha+1}$ with a precision $(\log N)^{C\xi} N^{-1}$. This behaviour
can be interpreted as a form of long-distance level repulsion, in which the eigenvalues $\mu_\beta$, $\beta < \alpha$, repel the
eigenvalue $\mu_\alpha$ and push it close to its maximum possible value, $\lambda_{\alpha+1}$.

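Lemma 6.2 (Eigenvalue sticking). Let $V$ be an $N \times N$ GOE matrix. Suppose moreover that $\xi$ satisfies
(3.10) and that $f$ satisfies $f \ge 1 + \varepsilon_0$. Then there is a $\delta \equiv \delta(\varepsilon_0) > 0$ such that for all $\alpha$ satisfying
$N(1 - \delta) \le \alpha \le N - 1$ we have with $(\xi, \nu)$-high probability

$$|\lambda_{\alpha+1} - \mu_\alpha| \le \frac{(\log N)^{C\xi}}{N}. \qquad (6.3)$$

Similarly, if $\alpha$ instead satisfies $\alpha \le N\delta$ we have with $(\xi, \nu)$-high probability

$$|\mu_\alpha - \lambda_\alpha| \le \frac{(\log N)^{C\xi}}{N}. \qquad (6.4)$$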



For the proof of Lemma 6.2 we shall need the following result about Wigner matrices, proved in [19].

Lemma 6.3. Let $H$ be a Wigner matrix with eigenvalues $\lambda_1 \le \cdots \le \lambda_N$ and associated eigenvectors
$u_1, \dots, u_N$. Assume that $\xi$ is given by (3.1). Then the following two statements hold with $(\xi, \nu)$-high
probability:

(logiV)'^« 



max Uq| 



N 



(6.5) 



$$|\lambda_\alpha - \gamma_\alpha| \le (\log N)^{C\xi} N^{-2/3} \big( \min\{\alpha, N + 1 - \alpha\} \big)^{-1/3}. \qquad (6.6)$$



Moreover, let L satisfy p. lip and write G^(z) := [(i/ 



Then we have, with (^, v)-high probability, 



where was defined in (|3.2I 



P 1^ max^|G,^(z) - 5.,m,e(z)| (log 7V)^« (^^^ 



Im msc(z) 1 



(6.7) 



Proof of Lemma 6.2. We only prove (6.3); the proof of (6.4) is analogous. By orthogonal invariance of
$V$, we may replace $e$ with the vector $(1, 0, \dots, 0)$. Let us abbreviate $\zeta_\beta := |u_\beta(1)|^2$. Note that (6.5) implies

$$\max_\beta \zeta_\beta \le (\log N)^{C\xi} N^{-1} \qquad (6.8)$$

with $(\xi, \nu)$-high probability. Now from (6.2) we get



Mo 



Aq+i 



E 



C/3 



l-la — Xfj 



which yields 



|Ao,+i — Mo 



E 



Xp — Ma 



1 

7' 

1 " 
7 



(6.9) 
We estimate from below, introducing an arbitrary $\eta > 0$,

C/5 _ V- C/3 



E 



C/3 



,9<Q + 



/3>a+l 

C/3(-^/3 - Ma) 



E 



/3>Q+1 ^ P 



(6.10) 



where in the third step we used that $\lambda_{\alpha+1} \ge \mu_\alpha$ by (6.1).

We now choose ?; = (logiV)'^i logiogAf^-i^ Ci large enough, we get from (jej)) that GXl{^ia + if?) 
'niscilJ'a + i?/) + 0(1). Therefore (|3.6p yields 



RcGi;(Ma + i?7) ^ l-2v/|2-M«l + o(l). 



(6.11) 



From (6.6) and (6.1) we get that $|\mu_\alpha - \gamma_\alpha| \le (\log N)^{C\xi} N^{-2/3}$ with $(\xi, \nu)$-high probability. Moreover, the
definition (3.15) and $\alpha \ge N(1 - \delta)$ imply $|\gamma_\alpha - 2| \le C\delta^{2/3}$. Thus we get, with $(\xi, \nu)$-high probability, that
$|2 - \mu_\alpha| = o(1) + C\delta^{2/3}$. Therefore (6.11) yields, with $(\xi, \nu)$-high probability,

$$-\operatorname{Re} G_{11}(\mu_\alpha + i\eta) \ge 1 + o(1) - C\delta^{1/3}.$$

Recalling (6.8), we therefore get from (6.10), with $(\xi, \nu)$-high probability,



E 



— /lo 



^ 1 + 0(1) 



1/3 m(logjV)'^^ (logiV)gg ^ 1 



iV3 ^ |A/3-/ia|3 



(6.12) 



for any $m \in \mathbb N$.

Next, from (6.6) we find that, provided $C_2$ is large enough, $m := (\log N)^{C_2 \xi}$, and $\beta > \alpha + m$, then we
have with $(\xi, \nu)$-high probability



I A/3 - Aa+i| > |7/3 - 7a + i| 



(logiV)C« 



iV2/3(7V + 1 - /3)i/3 
Then for $C_2$ large enough we have, with $(\xi, \nu)$-high probability,

1 „ NT^ 1 



> cYlp - 7a+l| ■ 



E 



1-^/3 - MaP 



^ E 



/3>a+Tn ^>Q:+m 

Thus we get from (6.12), with $(\xi, \nu)$-high probability,



(7^-7a+i)^ (log7V)3C.? 



E 

/3#Q + 1 



A^ — /ia 



^ iV3|A„+i -M«|3 
Plugging this into (6.9) and recalling that $f \ge 1 + \varepsilon_0 > 1$ yields, with $(\xi, \nu)$-high probability,



from which the claim follows. □ 

6.2. Proof of Theorem 2.7. In this section we prove Theorem 2.7 by establishing the following comparison
result for sparse matrices. Throughout the following we shall abbreviate the lower bound in (2.7) by

$$f_* := 1 + \varepsilon_0. \qquad (6.13)$$

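Proposition 6.4. Let $\mathbb P^v$ and $\mathbb P^w$ be laws on the symmetric $N \times N$ random matrices $H$, each satisfying
Definition 2.1 with $q \ge N^\phi$ for some $\phi$ satisfying $1/3 < \phi \le 1/2$. In particular, we have the moment
matching condition

$$\mathbb E^v h_{ij} = \mathbb E^w h_{ij} = 0, \qquad \mathbb E^v h_{ij}^2 = \mathbb E^w h_{ij}^2 = \frac{1}{N}. \qquad (6.14)$$

Set $f := f_*$ in Definition 2.2: $A \equiv A(f_*) = (a_{ij}) := H + f_* |e\rangle\langle e|$. As usual, we denote the ordered eigenvalues
of $H$ by $\lambda_1 \le \cdots \le \lambda_N$ and the ordered eigenvalues of $A$ by $\mu_1 \le \cdots \le \mu_N$.
Then there is a $\delta > 0$ such that for any $s \in \mathbb R$ we have

$$\mathbb P^v\big( N^{2/3}(\lambda_N - 2) \le s - N^{-\delta} \big) - N^{-\delta} \le \mathbb P^w\big( N^{2/3}(\lambda_N - 2) \le s \big) \le \mathbb P^v\big( N^{2/3}(\lambda_N - 2) \le s + N^{-\delta} \big) + N^{-\delta} \qquad (6.15)$$

as well as

$$\mathbb P^v\big( N^{2/3}(\mu_{N-1} - 2) \le s - N^{-\delta} \big) - N^{-\delta} \le \mathbb P^w\big( N^{2/3}(\mu_{N-1} - 2) \le s \big) \le \mathbb P^v\big( N^{2/3}(\mu_{N-1} - 2) \le s + N^{-\delta} \big) + N^{-\delta} \qquad (6.16)$$

for $N \ge N_0$ sufficiently large, where $N_0$ is independent of $s$.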

Assuming Proposition 6.4 is proved, we may easily complete the proof of Theorem 2.7 using the results
of Section 6.1.

Proof of Theorem 2.7. Choose $\mathbb P^v$ to be the law of GOE (see Remark [^^), and choose $\mathbb P^w$ to be the law
of a sparse matrix satisfying Definition 2.1 with $q \ge N^\phi$. We prove (2.11); the proof of (2.12) is similar.

For the following we write $\mu_\alpha(f) \equiv \mu_\alpha$ to emphasize the $f$-dependence of the eigenvalues of $A(f)$. Using
first (6.1) and then (6.15) we get

$$\mathbb P^w\big( N^{2/3}(\mu_{N-1}(f) - 2) \le s \big) \ge \mathbb P^w\big( N^{2/3}(\lambda_N - 2) \le s \big) \ge \mathbb P^v\big( N^{2/3}(\lambda_N - 2) \le s - N^{-\delta} \big) - N^{-\delta}$$

for some $\delta > 0$. Next, using first the monotonicity of $\mu_\alpha(f)$ from Lemma 6.1, then (6.16), and finally (6.3),
we get

$$\mathbb P^w\big( N^{2/3}(\mu_{N-1}(f) - 2) \le s \big) \le \mathbb P^w\big( N^{2/3}(\mu_{N-1}(f_*) - 2) \le s \big) \le \mathbb P^v\big( N^{2/3}(\mu_{N-1}(f_*) - 2) \le s + N^{-\delta} \big) + N^{-\delta} \le \mathbb P^v\big( N^{2/3}(\lambda_N - 2) \le s + 2N^{-\delta} \big) + 2N^{-\delta}$$

for some $\delta > 0$. This concludes the proof of (2.11), after a renaming of $\delta$. □
The rest of this section is devoted to the proof of Proposition 6.4. We shall only prove (6.16). The proof
of (6.15) is similar (in fact easier), and relies on the local semicircle law, Theorem 3.3, with $f = 0$; if $f = 0$
some of the following analysis simplifies (e.g. the proof of Lemma 6.8 below may be completed without the
estimate from Lemma 6.9).

From now on we always assume the setup of Proposition 6.4. In particular, $f$ will always be equal to $f_*$.
We begin by outlining the proof of Proposition 6.4. The basic strategy is similar to the one used for
Wigner matrices in [19] and [25]. For any $E_1 \le E_2$, let

$$\mathcal N(E_1, E_2) := \big| \{ \alpha : E_1 \le \mu_\alpha \le E_2 \} \big|$$

denote the number of eigenvalues of $A$ in the interval $[E_1, E_2]$. In the first step, we express the distribution
function in terms of Green functions according to

P"(mw-i^^) = E"X(A/'(£;,oo) - 2) w ¥.'^K{U{E,E.,) - I) w E'^kI^J dy N liam{y + ir]) - ij . 

(6.17) 

Here $u$ stands for either $v$ or $w$, $\eta := N^{-2/3-\varepsilon}$ for some $\varepsilon > 0$ small enough, $K : \mathbb R \to \mathbb R_+$ is a smooth cutoff
function satisfying

$$K(x) = 1 \ \text{ if } |x| \le 1/9 \qquad \text{and} \qquad K(x) = 0 \ \text{ if } |x| \ge 2/9, \qquad (6.18)$$

and

$$E_* := 2 + 2 (\log N)^{C_0 \xi} N^{-2/3} \qquad (6.19)$$

for some $C_0$ large enough. The first approximate identity in (6.17) follows from Theorem 3.4, which guarantees
that $\mu_{N-1} \le E_*$ with $(\xi, \nu)$-high probability, and from (3.21), which guarantees that $\mu_N \ge 2 + \sigma$ with $(\xi, \nu)$-high
probability. The second approximate identity in (6.17) follows from the approximation

$$\int_{E_1}^{E_2} dy \, N \operatorname{Im} m(y + i\eta) = \sum_\alpha \int_{E_1}^{E_2} dy \, \frac{\eta}{(y - \mu_\alpha)^2 + \eta^2} \approx \pi \, \mathcal N(E_1, E_2),$$

which is valid for $E_1$ and $E_2$ near the spectral edge, where the typical eigenvalue separation is $N^{-2/3} \gg \eta$.
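The approximation of the counting function by an integral of $N \operatorname{Im} m$ can be explored on a toy spectrum. In the Python sketch below, which is only an illustration with hypothetical names and hand-picked eigenvalues, each Lorentzian is integrated in closed form via the arctangent, and the result divided by $\pi$ is compared with the sharp count. The agreement is good when $\eta$ is much smaller than the distance from the endpoints $E_1, E_2$ to the nearest eigenvalues.

```python
import math


def smoothed_count(eigs, e1, e2, eta):
    """(1/pi) * int_{e1}^{e2} sum_a eta / ((y - mu_a)^2 + eta^2) dy,
    evaluated in closed form with the arctangent."""
    return sum(
        math.atan((e2 - mu) / eta) - math.atan((e1 - mu) / eta) for mu in eigs
    ) / math.pi


def sharp_count(eigs, e1, e2):
    """Number of eigenvalues in the closed interval [e1, e2]."""
    return sum(1 for mu in eigs if e1 <= mu <= e2)


if __name__ == "__main__":
    # A toy "edge" configuration: a few eigenvalues near 2 and one outlier.
    eigs = [1.90, 1.95, 1.97, 2.00, 2.40]
    e1, e2 = 1.94, 2.05
    print("sharp count:", sharp_count(eigs, e1, e2))
    for eta in (1e-2, 1e-3, 1e-4):
        print(f"eta = {eta:g}: smoothed count = {smoothed_count(eigs, e1, e2, eta):.5f}")
```

Each eigenvalue inside the window contributes almost exactly 1 and each one outside almost 0, with errors of order $\eta$ divided by its distance to the endpoints; this is the reason the scale $\eta \ll N^{-2/3}$ is used near the edge.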

The second step of our proof is to compare expressions such as the right-hand side of (6.17) for $u = v$ and
$u = w$. This is done using a Lindeberg replacement strategy and a resolvent expansion of the argument of
$K$. This step is implemented in Section 6.3, to which we also refer for a heuristic discussion of the argument.

Now we give the rigorous proof of the steps outlined in (6.17). We first collect the tools we shall need.
From (3.18) and (3.21) we get that there is a constant $C_0 > 0$ such that, under both $\mathbb P^v$ and $\mathbb P^w$, we have
with $(\xi, \nu)$-high probability

$$\big| N^{2/3} (\mu_{N-1} - 2) \big| \le (\log N)^{C_0 \xi}, \qquad \mu_N \ge 2 + \sigma, \qquad (6.20)$$

and

^(,_«!^,,,?(!^),„o.«)-^ (8.1) 

Therefore in (6.16) we can assume that $s$ satisfies

$$-(\log N)^{C_0 \xi} \le s \le (\log N)^{C_0 \xi}. \qquad (6.22)$$
Recall the definition (6.19) of $E_*$, and introduce, for any $E \le E_*$, the characteristic function on the
interval $[E, E_*]$,

$$\chi_E := \mathbf 1_{[E, E_*]}.$$

For any $\eta > 0$ we define

$$\theta_\eta(x) := \frac{\eta}{\pi (x^2 + \eta^2)} = \frac{1}{\pi} \operatorname{Im} \frac{1}{x - i\eta} \qquad (6.23)$$

to be an approximate delta function on scale $\eta$.
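The two expressions for $\theta_\eta$ in (6.23), and the sense in which it approximates a delta function, can be checked directly. The following Python sketch is an illustration with function names of our own choosing: it compares the rational form with the imaginary part of $1/(x - i\eta)$ using complex arithmetic, and evaluates the mass of $\theta_\eta$ on an interval in closed form.

```python
import math


def theta(x, eta):
    """Approximate delta function eta / (pi (x^2 + eta^2))."""
    return eta / (math.pi * (x * x + eta * eta))


def theta_via_resolvent(x, eta):
    """The same function written as (1/pi) Im 1/(x - i eta)."""
    return (1.0 / complex(x, -eta)).imag / math.pi


def mass(eta, a, b):
    """Integral of theta over [a, b], in closed form."""
    return (math.atan(b / eta) - math.atan(a / eta)) / math.pi


if __name__ == "__main__":
    for x in (-0.7, 0.0, 0.3):
        print(x, theta(x, 0.05), theta_via_resolvent(x, 0.05))
    # The mass concentrates near the origin as eta -> 0.
    for eta in (1e-1, 1e-2, 1e-3):
        print(f"eta = {eta:g}: mass on [-1, 1] = {mass(eta, -1.0, 1.0):.6f}, "
              f"mass on [0.1, 1] = {mass(eta, 0.1, 1.0):.6f}")
```

Applied to a matrix through functional calculus, the second form is what links traces of $\theta_\eta$ to the imaginary part of the Green function.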

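The following result allows us to replace the sharp counting function $\mathcal N(E, E_*) = \operatorname{Tr} \chi_E(H)$ with its
approximation smoothed on scale $\eta$.

Lemma 6.5. Suppose that $E$ satisfies

$$|E - 2| N^{2/3} \le (\log N)^{C_0 \xi}. \qquad (6.24)$$

Let $\ell := \frac12 N^{-2/3-\varepsilon}$ and $\eta := N^{-2/3-9\varepsilon}$, and recall the definition of the function $K$ from (6.18). Then the
following statements hold for both ensembles $\mathbb P^v$ and $\mathbb P^w$. For some $\varepsilon > 0$ small enough the inequalities

$$\operatorname{Tr}(\chi_{E+\ell} * \theta_\eta)(H) - N^{-\varepsilon} \le \mathcal N(E, \infty) - 1 \le \operatorname{Tr}(\chi_{E-\ell} * \theta_\eta)(H) + N^{-\varepsilon} \qquad (6.25)$$

hold with $(\xi, \nu)$-high probability. Furthermore, we have

$$\mathbb E \, K\big( \operatorname{Tr}(\chi_{E-\ell} * \theta_\eta)(H) \big) \le \mathbb P\big( \mathcal N(E, \infty) = 1 \big) \le \mathbb E \, K\big( \operatorname{Tr}(\chi_{E+\ell} * \theta_\eta)(H) \big) + e^{-\nu (\log N)^\xi} \qquad (6.26)$$

for sufficiently large $N$ independent of $E$, as long as (6.24) holds.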

Proof. The proof of Corollary 6.2 in [19] can be reproduced almost verbatim. In the estimate (6.17) of
[19], we need the bound, with $(\xi, \nu)$-high probability,

\m{E + ie)-m,,iE + ie)\ ^ ^^^^^ 
for Af^^+^ ^ ^ ^ N^'^/'^. This is an easy consequence of the local semicircle law p.l2p and the assumption 

Note that, when compared to Corollary 6.2 in [19], the quantity $\mathcal N(E, \infty)$ has been incremented by one;
the culprit is the single eigenvalue $\mu_N \ge 2 + \sigma$. □

Recalling that $\theta_\eta(H) = \frac{1}{\pi} \operatorname{Im} G(i\eta)$, Lemma 6.5 bounds the probability of $\mathcal N(E, \infty) = 1$ in terms of
expectations of functionals of Green functions. We now show that the difference between the expectations of
these functionals, with respect to the two probability distributions $\mathbb P^v$ and $\mathbb P^w$, is negligible assuming their
associated second moments of $h_{ij}$ coincide. The precise statement is the following Green function comparison
theorem at the edge. All statements are formulated for the upper spectral edge 2, but with the same proof
they hold for the lower spectral edge $-2$ as well.

For the following it is convenient to introduce the shorthand

$$I_\varepsilon := \{ x : |x - 2| \le N^{-2/3+\varepsilon} \} \qquad (6.27)$$

where $\varepsilon > 0$.
Proposition 6.6 (Green function comparison theorem on the edge). Suppose that the assumptions
of Proposition 6.4 hold. Let $F : \mathbb R \to \mathbb R$ be a function whose derivatives satisfy

$$\sup_x |F^{(n)}(x)| (1 + |x|)^{-C_1} \le C_1 \qquad \text{for } n = 1, 2, 3, 4, \qquad (6.28)$$

with some constant $C_1 > 0$. Then there exists a constant $\widetilde\varepsilon > 0$, depending only on $C_1$, such that for any
$\varepsilon < \widetilde\varepsilon$ and for any real numbers $E, E_1, E_2 \in I_\varepsilon$, and setting $\eta := N^{-2/3-\varepsilon}$, we have

$$\big| \mathbb E^v F\big( N\eta \operatorname{Im} m(z) \big) - \mathbb E^w F\big( N\eta \operatorname{Im} m(z) \big) \big| \le C N^{1/3 + C\varepsilon} q^{-1}, \qquad z = E + i\eta, \qquad (6.29)$$

and

$$\Big| \mathbb E^v F\Big( N \int_{E_1}^{E_2} dy \operatorname{Im} m(y + i\eta) \Big) - \mathbb E^w F\Big( N \int_{E_1}^{E_2} dy \operatorname{Im} m(y + i\eta) \Big) \Big| \le C N^{1/3 + C\varepsilon} q^{-1} \qquad (6.30)$$

for some constant $C$ and large enough $N$.

We postpone the proof of Proposition 6.6 to the next section. Assuming it proved, we now have all the
ingredients needed to complete the proof of Proposition 6.4.



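Proof of Proposition 6.4. As observed after (6.20) and (6.21), we may assume that (6.22) holds. We
define $E := 2 + s N^{-2/3}$, which satisfies (6.24). We define $E_*$ as in (6.19) with the $C_0$ such that (6.20) and
(6.21) hold. From (6.26) we get, for any sufficiently small $\varepsilon > 0$,

$$\mathbb E^w K\big( \operatorname{Tr}(\chi_{E-\ell} * \theta_\eta)(H) \big) \le \mathbb P^w\big( \mathcal N(E, \infty) = 1 \big), \qquad (6.31)$$

where $\ell$ and $\eta$ are set as in Lemma 6.5.

Now (6.30) applied to the case $E_1 = E - \ell$ and $E_2 = E_*$ shows that there exists a $\delta > 0$ such that for
sufficiently small $\varepsilon > 0$ we have

$$\mathbb E^v K\big( \operatorname{Tr}(\chi_{E-\ell} * \theta_\eta)(H) \big) \le \mathbb E^w K\big( \operatorname{Tr}(\chi_{E-\ell} * \theta_\eta)(H) \big) + N^{-\delta} \qquad (6.32)$$

(note that here $9\varepsilon$ plays the role of $\varepsilon$ in Proposition 6.6). Next, the second bound of (6.26) yields

$$\mathbb P^v\big( \mathcal N(E - 2\ell, \infty) = 1 \big) \le \mathbb E^v K\big( \operatorname{Tr}(\chi_{E-\ell} * \theta_\eta)(H) \big) + e^{-\nu (\log N)^\xi}. \qquad (6.33)$$

Combining these inequalities, we have

$$\mathbb P^v\big( \mathcal N(E - 2\ell, \infty) = 1 \big) \le \mathbb P^w\big( \mathcal N(E, \infty) = 1 \big) + 2N^{-\delta} \qquad (6.34)$$

for sufficiently small $\varepsilon > 0$ and sufficiently large $N$. Setting $E = 2 + s N^{-2/3}$ proves the first inequality of
(6.16). Switching the roles of $v$ and $w$ in (6.34) yields the second inequality of (6.16). □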
6.3. Proof of Proposition 6.6. All that remains is the proof of Proposition 6.6, to which this section is
devoted. Throughout this section we suppose that the assumptions of Proposition 6.4 hold, and in particular
that $f = 1 + \varepsilon_0$.

We now set up notations to replace the matrix elements one by one. This step is identical for the proof
of both (6.29) and (6.30); we use the notations of the case (6.29), for which they are less involved.

For the following it is convenient to slightly modify our notation. We take two copies of our probability
space, one of which carries the law $\mathbb P^v$ and the other the law $\mathbb P^w$. We work on the product space and write
$H^v$ for the copy carrying the law $\mathbb P^v$ and $H^w$ for the copy carrying the law $\mathbb P^w$. The matrices $A^v$ and $A^w$ are
defined in the obvious way, and we use the notations $A^v = (v_{ij})$ and $A^w = (w_{ij})$ for their entries. Similarly,
we denote by $G^v(z)$ and $G^w(z)$ the Green functions of the matrices $A^v$ and $A^w$.

Fix a bijective ordering map on the index set of the independent matrix elements, 



4> : {{i,j) : I ^ j !^ N} |o, . . . , 7max| where 7„ 



N{N +1) 



- 1. 



and denote by A^ the generalized Wigner matrix whose matrix elements a^j follow the u-distribution if 



^ 7 and they follow the w-distribution otherwise; in particular Aq ~ A'^ and A^ 
Next, set $\eta := N^{-2/3-\varepsilon}$. We use the estimate

$$\operatorname{Im} m_{sc}(E + i\eta) \le C \sqrt{|E - 2| + \eta} \le C N^{-1/3 + \varepsilon/2}.$$

Therefore Theorem 2.9 of [11] yields, with $(\xi, \nu)$-high probability,



= A" 



max max max 



A. 



6kimsc{E + b]) 



< 



where we defined 



1 
P 



We set z = + irj where E £ and 77 = TV^^/a-e^ Using ([QT)) . and the identity 

Imm = ^ImTrG ^Yj^'^G^^ ' 



3.35) 



(6.36) 



3.37) 



we find, as in (6.36) of [19], that in order to prove (6.29) it is enough to prove



EF\t]^J2 GJjG]^ I - (G^ ^ G^) 



(6.38) 



at $z = E + i\eta$. We write the quantity in the absolute value on the left-hand side of (6.38) as a telescopic
sum,



«^(.rE(3^ 



EF(yl^ ^ A^) 



= -J2{^ ^(^^ -^^-y)-^ Fi^^ A-i)) • (6-39) 
7=2 
Let $E^{(ij)}$ denote the matrix whose matrix elements are zero everywhere except at the $(i, j)$ position,
where it is 1, i.e. $E^{(ij)}_{kl} := \delta_{ik} \delta_{jl}$. Fix $\gamma \ge 1$ and let $(b, d)$ be determined by $\phi(b, d) = \gamma$. For definiteness,
we assume the off-diagonal case $b \ne d$; the case $b = d$ can be treated similarly. Note that the number of
diagonal terms is $N$ and the number of off-diagonal terms is $O(N^2)$. We shall compare $A_{\gamma-1}$ with $A_\gamma$ for
each $\gamma$ and then sum up the differences in (6.39).

Note that these two matrices differ only in the entries $(b, d)$ and $(d, b)$, and they can be written as

$$A_{\gamma-1} = Q + V \quad \text{where} \quad V := (v_{bd} - \mathbb E v_{bd}) E^{(bd)} + (v_{db} - \mathbb E v_{db}) E^{(db)}, \qquad (6.40)$$

and

$$A_\gamma = Q + W \quad \text{where} \quad W := (w_{bd} - \mathbb E w_{bd}) E^{(bd)} + (w_{db} - \mathbb E w_{db}) E^{(db)},$$

where the matrix $Q$ satisfies

$$Q_{bd} = Q_{db} = f/N = \mathbb E v_{bd} = \mathbb E v_{db} = \mathbb E w_{bd} = \mathbb E w_{db},$$

where, we recall, $f = 1 + \varepsilon_0$. It is easy to see that

$$\max_{i,j} |V_{ij}| + \max_{i,j} |W_{ij}| \le (\log N)^{C\xi} q^{-1} \qquad (6.41)$$

with $(\xi, \nu)$-high probability, and that

$$\mathbb E V_{ij} = \mathbb E W_{ij} = 0, \qquad \mathbb E (V_{ij})^2 = \mathbb E (W_{ij})^2 \le C/N, \qquad \mathbb E |V_{ij}|^k + \mathbb E |W_{ij}|^k \le C N^{-1} q^{2-k} \qquad (6.42)$$

for $k = 2, 3, 4, 5, 6$.

We define the Green functions

$$R := \frac{1}{Q - z}, \qquad S := \frac{1}{A_{\gamma-1} - z}, \qquad T := \frac{1}{A_\gamma - z}. \qquad (6.43)$$

We now claim that the estimate (6.36) holds for the Green function $R$ as well, i.e.

max max\Rki{E + irj) - Skinisc{E + b-j)\ ^ p^^ (6.44) 

holds with (f , t/)-high probability. To sec this, we use the resolvent expansion 

R = S + SVS + {SVfS +... + {SV)^S + {SVy"R. (6.45) 

Since V has only at most two nonzero elements, when computing the entry (fc, /) of this matrix identity, 
each term is a sum of finitely many terms (i.e. the number of summands is independent of N) that involve 
matrix elements of 5 or i? and u^ , e.g. of the form {SVS)ki = SkiVijSji + SkjVjiSu. Using the bound (j6.36p 
for the 5 matrix elements, the bound (|6.4ip for Vij and the trivial bound \Rij \ ^ rj^^ ^ N, we get (|6.44p . 
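The finite resolvent expansion used here is an exact algebraic identity, obtained by iterating $R = S + SVR$. As a sanity check, the sketch below verifies it numerically for a small test case. The $3 \times 3$ matrices, the spectral parameter and all function names are illustrative assumptions, not data from the text.

```python
def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting for a small complex matrix."""
    n = len(a)
    aug = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [x / d for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(h, z):
    """Return (h - z)^(-1)."""
    n = len(h)
    return inverse([[h[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)])


def expansion_error(q, v, z, m):
    """Max entrywise error of R = S + SVS + ... + (SV)^m S + (SV)^(m+1) R."""
    n = len(q)
    s = resolvent(matadd(q, v), z)   # S = (Q + V - z)^(-1)
    r = resolvent(q, z)              # R = (Q - z)^(-1)
    sv = matmul(s, v)
    power = [[complex(i == j) for j in range(n)] for i in range(n)]  # (SV)^0
    total = [[0j] * n for _ in range(n)]
    for _ in range(m + 1):
        total = matadd(total, matmul(power, s))
        power = matmul(power, sv)
    total = matadd(total, matmul(power, r))  # remainder term (SV)^(m+1) R
    return max(abs(total[i][j] - r[i][j]) for i in range(n) for j in range(n))


# Hypothetical test data: symmetric Q, and V supported on the entries (0,1) and (1,0).
Q_EXAMPLE = [[0.2, 0.5, -0.1], [0.5, -0.3, 0.4], [-0.1, 0.4, 0.1]]
V_EXAMPLE = [[0.0, 0.05, 0.0], [0.05, 0.0, 0.0], [0.0, 0.0, 0.0]]
Z_EXAMPLE = complex(0.3, 0.5)
```

With $m = 9$ this is the expansion (6.45). The identity is exact for every $m$, so the truncation order only matters for the size of the remainder term, which is what the trivial bound $|R_{ij}| \leq \eta^{-1}$ controls.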

Having introduced these notations, we may now give an outline of the proof of Proposition 6.6. We have
to estimate each summand of the telescopic sum (6.39) with $b \neq d$ (the generic case) by $o(N^{-2})$; in the
non-generic case $b = d$, a bound of size $o(N^{-1})$ suffices. For simplicity, assume that we are in the generic
case $b \neq d$ and that $F$ has only one argument. Fix $z = E + i\eta$, where $E \in I_\varepsilon$ (see (6.27)) and $\eta = N^{-2/3-\varepsilon}$.
Define

$y^S := \eta^2\sum_{i \neq j}S_{ij}(z)\overline{S_{ji}(z)}$; (6.46)

the random variable $y^R$ is defined similarly. We shall show that

$\mathbb{E}F(y^S) - \mathbb{E}F(y^R) = B + o(N^{-2})$ (6.47)

for some deterministic $B$ which depends only on the law of $Q$ and the first two moments of $v_{bd}$. From (6.47)
we immediately conclude that (6.29) holds. In order to prove (6.47), we expand

$F(y^S) - F(y^R) = F'(y^R)(y^S - y^R) + \frac{1}{2}F''(y^R)(y^S - y^R)^2 + \frac{1}{6}F'''(\zeta)(y^S - y^R)^3$, (6.48)

where $\zeta$ lies between $y^S$ and $y^R$. Next, we apply the resolvent expansion

$S = R - RVR + (RV)^2R - \dots + (-1)^m(RV)^mR + (-1)^{m+1}(RV)^{m+1}S$ (6.49)

to each factor $S$ in (6.48) for some $m \geq 2$. Here we only concentrate on the linear term in (6.48). The second
order term is dealt with similarly. (The rest term in (6.48) requires a different treatment because $F'''(\zeta)$ is not
independent of $v_{bd}$. It may however be estimated cheaply using a naive power counting.) By definition, $Q$ is
independent of $v_{bd}$, and hence $F'(y^R)$ and $R$ are independent of $v_{bd}$. Therefore the expectations of the first
and second order terms (in the variable $v_{bd}$) in $\mathbb{E}F'(y^R)(y^S - y^R)$ are put into $B$. The third order terms in
$\mathbb{E}F'(y^R)(y^S - y^R)$ are bounded, using a naive power counting, by

$\eta^2N^2p^{-3}\mathbb{E}|v_{bd}|^3 \leq N^{-4/3}N^2p^{-3}N^{-1}q^{-1}$. (6.50)

Here we used that, thanks to the assumption $i \neq j$, in the generic terms $\{i,j\} \cap \{b,d\} = \emptyset$ there are at least
three off-diagonal matrix elements $R$ in the resolvent expansion of (6.46). Indeed, since $b,d \notin \{i,j\}$, the
terms of order greater than one in (6.49) have at least two off-diagonal resolvent matrix elements, and the other
factor in (6.46) has at least one off-diagonal resolvent matrix element since $i \neq j$. Thus we get a factor $p^{-3}$
by (6.36) (the non-generic terms are suppressed by a factor $N^{-1}$). Note that the bound (6.50) is still too
large compared to $N^{-2}$, since $p \leq N^{1/3}$. The key observation to solve this problem is that the expectation
of the leading term is much smaller than its typical size; this allows us to gain an additional factor $p^{-1}$. A
similar observation was used in [19], but in the present case this estimate (performed in Lemma 6.8 below) is
substantially complicated by the non-vanishing expectation of the entries of $A$. Much of the heavy notation
in the following argument arises from the need to keep track of the non-generic terms, which have fewer
off-diagonal elements than the generic terms, but have a smaller entropy factor. The improved bound on
the difference $\mathbb{E}F'(y^R)(y^S - y^R)$ is

$N^{-4/3}N^2p^{-4}N^{-1}q^{-1} = N^{-1/3}p^{-4}q^{-1}$,

which is much smaller than $N^{-2}$ provided that $q \geq N^{\phi}$ for $\phi > 1/3$ and $\varepsilon$ is small enough.
The key step to the proof of Proposition 6.6 is the following lemma.

Lemma 6.7. Fix an index $\gamma = \phi(b,d)$ and recall the definitions of $Q$, $R$ and $S$ from (6.43). For any small
enough $\varepsilon > 0$ and under the assumptions in Proposition 6.6, there exists a constant $C$ depending on $F$ (but
independent of $\gamma$) and constants $B_N$ and $D_N$, depending on the law $\mathrm{law}(Q)$ of the matrix $Q$ and on
the second moments $m_2(v_{bd})$ of $v_{bd}$, such that, for large enough $N$ (independent of $\gamma$) we have

EF f ?7 / "dy VSy5j,(y + i77) I -EfItj [ dy V %i?j,(y + i/y) ) - BN{m2{vbd)Mw{Q)) 
where, we recall, $\eta = N^{-2/3-\varepsilon}$, as well as



\ / \ i^J J 

(6.52) 

where $z = E + i\eta$. The constants $B_N$ and $D_N$ may also depend on $F$, but they depend on the centered random
variable $v_{bd}$ only through its second moments.

Assuming Lemma 6.7, we now complete the proof of Proposition 6.6.

Proof of Proposition 6.6. Clearly, Lemma 6.7 also holds if $S$ is replaced by $T$. Since $Q$ is independent
of $v_{bd}$ and $w_{bd}$, and $m_2(v_{bd}) = m_2(w_{bd}) = 1/N$, we have $D_N(m_2(v_{bd}), \mathrm{law}(Q)) = D_N(m_2(w_{bd}), \mathrm{law}(Q))$.
Thus we get from Lemma 6.7 that



\F I 7?2^S.,%(z) -EF ifY.T,,T,,{z) 



(6.53) 



Recalling the definitions of $S$ and $T$ from (6.43), the bound (6.53) compares the expectation of a function
of the resolvent of $A_{\gamma-1}$ and that of $A_\gamma$. The telescopic summation in (6.39) then implies (6.38), since the
number of summands with $b \neq d$ is of order $N^2$ but the number of summands with $b = d$ is only $N$. Similarly,
(6.51) implies (6.30). This completes the proof. □



Proof of Lemma 6.7. Throughout the proof we abbreviate $A_{\gamma-1} = A = (a_{ij})$ where $a_{ij} = h_{ij} + f/N$. We
shall only prove the more complicated case (6.51); the proof of (6.52) is similar. In fact, we shall prove the
bound



EF I 77 / ' dyY, S^3SJ^{y + iv)j -EFI^t] ' dz^^ %i?j,(y + iry) j - BN{m2{hM)Mw{Q)) 



El 



from which (6.51) follows by (6.37).
From (6.36) we get

$\max_{k,l}\max_{E \in I_\varepsilon}|S_{kl}(E + i\eta) - \delta_{kl}m_{sc}(E + i\eta)| \leq p^{-1}$ (6.55)



with $(\xi,\nu)$-high probability. Define $\Omega$ as the event on which (6.55), (6.44), and (6.41) hold. We have proved
that $\Omega$ holds with $(\xi,\nu)$-high probability. Since the arguments of $F$ in (6.54) are bounded by $CN^{2+2\varepsilon}$ and
$F(x)$ increases at most polynomially, it is easy to see that the contribution of the event $\Omega^c$ to the expectations
in (6.54) is negligible.
Define $x^S$ and $x^R$ by



$x^S := \eta\int_{E_1}^{E_2}dy\sum_{i \neq j}S_{ij}\overline{S_{ji}}(y + i\eta)$, $x^R := \eta\int_{E_1}^{E_2}dy\sum_{i \neq j}R_{ij}\overline{R_{ji}}(y + i\eta)$, (6.56)

and decompose $x^S$ into three parts

$x^S = x_0^S + x_1^S + x_2^S$, $x_k^S := \eta\int_{E_1}^{E_2}dy\sum_{i \neq j}\mathbf{1}(|\{i,j\} \cap \{b,d\}| = k)\,S_{ij}\overline{S_{ji}}(y + i\eta)$; (6.57)

$x_k^R$ is defined similarly. Here $k = |\{i,j\} \cap \{b,d\}|$ is the number of times the indices $b$ and $d$ appear among
the summation indices $i,j$. Clearly $k = 0$, 1 or 2. The number of the terms in the sum of the definition of
$x_k^S$ is $O(N^{2-k})$. A resolvent expansion yields

$S = R - RVR + (RV)^2R - (RV)^3R + (RV)^4R - (RV)^5R + (RV)^6S$. (6.58)

In the following formulas we shall, as usual, omit the spectral parameter from the notation of the resolvents.
The spectral parameter is always $y + i\eta$ with $y \in [E_1, E_2]$; in particular, $y \in I_\varepsilon$.

If $|\{i,j\} \cap \{b,d\}| = k$, recalling that $i \neq j$ we find that there are at least $2 - k$ off-diagonal resolvent
elements in $[(RV)^mR]_{ij}$, so that (6.44) yields in $\Omega$

$|[(RV)^mR]_{ij}| \leq C_m(N^{\varepsilon}q^{-1})^mp^{-(2-k)}$ where $m \in \mathbb{N}_+$, $m \leq 6$, $k = 0,1,2$. (6.59)

Similarly, we have in $\Omega$

$|[(RV)^mS]_{ij}| \leq C_m(N^{\varepsilon}q^{-1})^mp^{-(2-k)}$ where $m \in \mathbb{N}_+$, $m \leq 6$, $k = 0,1,2$. (6.60)

Therefore we have in $\Omega$ that

$|x_k^S - x_k^R| \leq CN^{2/3-k}p^{-(3-k)}N^{\varepsilon}q^{-1}$ for $k = 0,1,2$, (6.61)

where the factor $N^{2/3-k}$ comes from $\sum_{i \neq j}$, $\eta$ and $\int dy$. Inserting these bounds into the Taylor expansion
of $F$, using

q N'l' Ari/3+Ce j> ^ ^ ^^/^^^e (g_g2) 
and keeping only the terms larger than 0{N^^/'^^'-^^p^^q^^)^ we obtain 

E(F(a;^) - F(x-^)) - E (^F'(x^)(x^ - x^) + i^^"(x«)(4 - x^f + F'{x''){xl - )^ 

^ CiV-l/3+CCp-4^-l ^ (g g3) 

where we used the remark after (6.55) to treat the contribution on the event $\Omega^c$. Since there is no $x_2$ appearing
in (6.63), we can focus on the cases $k = 0$ and $k = 1$.
To streamline the notation, we introduce

$R_{ij}^{(m)} := (-1)^m[(RV)^mR]_{ij}$. (6.64)

Then using (6.59) and the estimate $\max_{i \neq j}|R_{ij}| \leq p^{-1}$ we get

$|R_{ij}^{(m)}| \leq C_m(N^{\varepsilon}q^{-1})^mp^{-(2-k)+\delta_{0m}\delta_{0k}}$. (6.65)

Now we decompose the sum $x_k^S - x_k^R$ according to the number of matrix elements $h_{bd}$ and $h_{db}$. To that
end, for $k \in \{0, 1\}$ and $s,t \in \{0, 1, 2, 3, 4\}$ and $s + t \geq 1$, we define



E2 



dyY, l{\{^,J}n{b,d}\^k) E^R^, 



and set 



(s,t) 
k 



s+t=e 



Using (|6.65p we get the estimates, vahd on il, 

< C,t(7V^g-l)'+*7V2/3-fep-(4-2fc)+5o.5o.+5o*5o. ^ |qW| 

where i ^ 1. Using (|6.62p . ()6.59p . and (|6.60p . we find the decomposition 



(6.66) 



(6.67) 



(6.68) 



E Q 



(s,t) 



■0{N 



-l/3+Ce„-4 -1 



where $s$ and $t$ are non-negative. By (6.68) and (6.44) we have for $s + t \geq 1$



{s,t) 



P 



(6.69) 



(6.70) 



where $\mathbb{E}_{bd}$ denotes partial expectation with respect to the variable $h_{bd}$. Here we used that only terms with
at least two elements $h_{bd}$ or $h_{db}$ survive. Recalling (6.42), we find that taking the partial expectation $\mathbb{E}_{bd}$
improves the bound (6.68) by a factor $q^2/N$. Thus we also have



Similarly, for $s + t \geq 1$ and $u + v \geq 1$ we have



^ g2-s-t-u-i,^l/3-2fe^ 



which implies 



QV'Q 



^ ^2-<'i-f2^(l/3-2fe)+Cep-6+2fe 



(6.71) 



(6.72) 



(6.73) 



Inserting (6.71) and (6.73) into the second term of the left-hand side of (6.63), and using the assumption on $F$
as well as (6.62), we find

F'(x«)(4 - x^) + F'(x«)(xf - ) + iF"(x«)(4 - 4r 
= B + EF'(.T^)Q^=^) + O [^N-'/^+^'pS-') 

^ B + EF'ix^) (g(°"^) -I- q(^^°') + O (7V-i/3+Cep-4g-i) ^ (6.74) 
where we defined



B 



(6.75) 



Note that $B$ depends on $h_{bd}$ only through its expectation (which is zero) and on its second moment. Thus,
$B$ will be $B_N(m_2(v_{bd}), \mathrm{law}(Q))$ from (6.51).

In order to estimate it only remains to estimate EF' {x^)Q^q'^'' and EF' {x'^)Q^o'°\ Using (PTTOI . 

dull), and (PTTj) . we have 



L{F{x'^) - F{x^)) - B 



3.76) 



which implies (6.54) in the case $b = d$.

Let us therefore from now on assume $b \neq d$. Since we estimate $Q_0^{(0,3)}$ and $Q_0^{(3,0)}$, this implies that $i,j,b,d$
are all distinct. In order to enforce this condition in sums, it is convenient to introduce the indicator function
$\chi = \chi(i,j,b,d) := \mathbf{1}(|\{i,j,b,d\}| = 4)$.

Recalling (6.64), we introduce the notation $R_{ij}^{(m,s)}$ to denote the sum of the terms in the definition (6.64)
of $R_{ij}^{(m)}$ in which the number of the off-diagonal elements of $R$ is $s$. For example,



Then in the case x = 1 we have 



Rif'^^ — RibhbdRddhdbRbbhbdRdj + RidhdbRbbhbdRddhdbRbj 



R 



(3) 



s=2 



Now from the definition (|6.66p we get 



i(0,3) 



= Eq^' 



(0,3,s) 



where Qq^''^'"'' 



:= ?7 



s=2 



£^d,^x<i?. 



(0) p(3,s) 
■ji > 



and 



(3,0) 



(3,0, s) 



where Qq 



(3,0, s) 



As above, it is easy to see that for s ^ 3 we have 

which implies, using (|6.74p . 

|E(F(2;«) - F(.T«)) -B\ ^ EF'(x«)q(°''''^ + EF'(x«)q(''°''^ + Ar-V3+Ce^-i 



(6.77) 
(6.78) 

(6.79) 

(6.80) 

(6.81) 
(6.82) 
By symmetry, it only remains to prove that



(6.83) 



Using the definition (6.79) and the estimate (6.44) to replace some diagonal resolvent matrix elements with
$m_{sc}$, we find



i?A^(3.0,2) 



El 
E2 

El 



dyEF'{x'')J2: 



RibhbdRddhdbRbbhbdRdjRji + {b ^ d) 

ml^RibRdjRji (Efcd Ihbdl^hbd) (b^d) 



(6.84) 



where we used the estimate | E^d | ft-bd P | ^ 1^ to control the errors for the replacement. Combining (|6.84p 



with (|6.74p and (|6.63p . we therefore obtain 

\E[F{x^)- Fix^)]- B\ < 



\EF'{x'^')R^JRJbRd^\ + \EF'{x'^)R,bRd3RJ^\ + {b^d) , (6.85) 



and that every estimate is uniform in y. 



X max max x 

where we used the trivial bounds on F' and 

In order to complete the proof of Lemma 6.7 we need to estimate the expectations in (6.85) by a better
bound than the naive high-probability bound on the argument of $\mathbb{E}$. This is accomplished in Lemma 6.8
below. From Lemma 6.8 and (6.85) we get in the case $b \neq d$ that

$|\mathbb{E}[F(x^S) - F(x^R)] - B| \leq N^{-1/3+C\varepsilon}p^{-4}q^{-1}$, (6.86)

where $B$ was defined in (6.75). This completes the proof of Lemma 6.7. □



Lemma 6.8. Under the assumptions of Lemma 6.7, in particular fixing $\gamma = \phi(b,d)$, and assuming that $b,d,i,j$
are all distinct, we have

$\max_{y \in [E_1,E_2]}|\mathbb{E}F'(x^R)R_{ib}R_{dj}\overline{R_{ji}}(y + i\eta)| \leq Cp^{-4}$. (6.87)

The same estimate holds for the other three terms on the right-hand side of (6.85).



In order to prove Lemma 6.8, we shall need the following result, which is crucial when estimating terms
arising from the nonvanishing expectation $\mathbb{E}a_{ij} = fN^{-1}$. Before stating it, we introduce some notation.

Recall that we set $A = A_{\gamma-1} = (a_{ij})$, where the matrix entries are given by $a_{ij} = h_{ij} + f/N$ and
$\mathbb{E}h_{ij} = 0$. We denote by $A^{(b)}$ the matrix obtained from $A$ by setting all entries with index $b$ to zero, i.e.
$(A^{(b)})_{ij} := \mathbf{1}(i \neq b)\mathbf{1}(j \neq b)a_{ij}$. If $Z = Z(A)$ is a function of $A$, we define $Z^{(b)} := Z(A^{(b)})$. See also
Definitions 5.2 and 3.3 in [11]. We also use the notation $\mathbb{E}_b$ to denote partial expectation with respect to all
variables $(a_{1b}, \dots, a_{Nb})$ in the $b$-th column of $A$.

Lemma 6.9. For any fixed i we have, with (^, v)-high probability, 



k^i l^k 






Proof. The claim is an immediate consequence of Proposition 7.11 in [11] and the observation that, for
$E \in I_\varepsilon$, $\eta = N^{-2/3-\varepsilon}$, and $q \geq N^{\phi}$, we have



(logiV)^ _ + 



1 /imm. 



□ 



for large enough N . 

Another ingredient necessary for the proof of Lemma 6.8 is the following resolvent identity.

Lemma 6.10. Let $A = (a_{ij})$ be a square matrix and set $S = (S_{ij}) = (A - z)^{-1}$. Then for $i \neq j$ we have

$S_{ij} = -S_{ii}\sum_{k \neq i}a_{ik}S_{kj}^{(i)}$, $S_{ij} = -S_{jj}\sum_{k \neq j}S_{ik}^{(j)}a_{kj}$. (6.88)

Proof. We prove the first identity in (6.88); the second one is proved analogously. We use the resolvent
identity

$S_{ij} = S_{ij}^{(k)} + \frac{S_{ik}S_{kj}}{S_{kk}}$ for $i,j \neq k$ (6.89)

from [11], (3.8). Without loss of generality we assume that $z = 0$. Then (6.89) and the identity $AS = \mathbb{1}$
yield

$\sum_{k \neq i}a_{ik}S_{kj}^{(i)} = \sum_{k \neq i}a_{ik}S_{kj} - \sum_{k \neq i}a_{ik}\frac{S_{ki}S_{ij}}{S_{ii}} = -a_{ii}S_{ij} - \frac{S_{ij}}{S_{ii}}(1 - a_{ii}S_{ii}) = -\frac{S_{ij}}{S_{ii}}$. □
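The first identity in (6.88) is purely algebraic, so it can be checked on any small matrix. The sketch below does this for a hypothetical $4 \times 4$ real matrix; the matrix, the spectral parameter and the function names are illustrative assumptions. Here $S^{(i)}$ is computed as the resolvent of the minor obtained by deleting row and column $i$, which agrees with the resolvent of $A^{(i)}$ on the remaining indices.

```python
def inverse(a):
    """Gauss-Jordan inverse with partial pivoting for a small complex matrix."""
    n = len(a)
    aug = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [x / d for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(a, z):
    """Return (a - z)^(-1)."""
    n = len(a)
    return inverse([[a[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)])


def minor(a, i):
    """Delete row and column i."""
    n = len(a)
    return [[a[r][c] for c in range(n) if c != i] for r in range(n) if r != i]


def identity_gap(a, z, i, j):
    """|S_ij + S_ii * sum_{k != i} a_ik S^(i)_kj| for i != j; zero if (6.88) holds."""
    n = len(a)
    s = resolvent(a, z)
    s_minor = resolvent(minor(a, i), z)
    kept = [k for k in range(n) if k != i]            # original labels of the minor
    pos = {k: t for t, k in enumerate(kept)}          # original label -> minor position
    rhs = -s[i][i] * sum(a[i][k] * s_minor[pos[k]][pos[j]] for k in kept)
    return abs(s[i][j] - rhs)


def max_gap(a, z):
    n = len(a)
    return max(identity_gap(a, z, i, j) for i in range(n) for j in range(n) if i != j)


# Hypothetical non-symmetric test matrix: the lemma only requires A to be square.
A_EXAMPLE = [[0.10, 0.20, -0.30, 0.05],
             [0.25, -0.10, 0.15, 0.30],
             [-0.20, 0.10, 0.20, -0.15],
             [0.05, -0.25, 0.30, 0.10]]
Z_EXAMPLE = complex(0.3, 2.0)
```

The test matrix is deliberately non-symmetric, since the lemma is stated for a general square matrix; the second identity in (6.88) can be checked in the same way by applying the first one to the transpose.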

Armed with Lemmas 6.9 and 6.10, we may now prove Lemma 6.8.

Proof of Lemma 6.8. With the relation between $R$ and $S$ in (6.45) and (6.59), we find that (6.87) is
implied by

" (6.90) 



under the assumption that $b,d,i,j$ are all distinct. This replacement is only a technical convenience when
we apply a large deviation estimate below.

Recalling the definition of $\Omega$ after (6.55), we get using (6.89)



SibSbj{Sbb) ^ 



< Cp- 



in n . 



This yields 
Similarly, we have 



in n . 



(6.91) 
(6.92) 
(6.93) 

(6.94) 

Since (x"^)'^''^ and 5'^''S'^^^ are independent of the 6-th row of A, we find from (j6.94p that (|6.90p . and hence 



SibSdjSji SibS^j^j^ Sj^'' 



^ Cp' 



in n . 



Hence by assumption on F we have 

\EF\x'')S,bSd^,\ ^ E(i^'((x^)W))5,,5(fsJ^ +0(p-47V^-) 



dj '-'ji 

.87p . is proved if we can show that 



EbS^b = 0{p-'') 



(6.95) 
for any fixed $i$ and $b$.

What remains therefore is to prove (6.95). Using (6.55) and (6.91) we find in $\Omega$ that



Sbb = msc + 0{p-'), Sll^ = O(p-i). (6.96) 



Using flfcb = hkb + f/N we write 



By (|6.96p and the large deviation estimate (3.15) in [TT], the second sum in (|6.97p is bounded by 0{p ^) 
with (^, i')-high probabihty. Therefore, using (|6.96p and K^hkb = 0, we get 

KbS^b = ^ E 4'' + 0{p-') = ^ + Oip-') , (6.98) 

k^b kyti 

where in the second step we used (6.89).

In order to estimate the right-hand side of (6.98), we introduce the quantity



N 

Note that $X$ depends on the index $i$, which is omitted from the notation as it is fixed. Using (6.88), (6.96),
and (6.89) as above, we find with $(\xi,\nu)$-high probability

^~EE^^^4^^- + ^)+o(^-^) 

k^i l=tLk ^ ^ 

_ -mscf \ - \ - „(fc) 



EE4^ + o(p-) 



iV2 

k^i l^k 



N 



e/X + o('lE(^,,-EzS,,)) +0(p-') 



N 



Now recaU that the spectral parameter z = E + irj satisfies E ^ Ig, (see (|6.27p ) and rj = N ^/"^ ^ . Therefore 
p.6p implies that msc{z) = —1+0(1). Recalling that / = 1+eo, we therefore get, with (^, ;^)-high probability, 

X - o('^E(5,z-E;5,,)') +0(p-2). (6.99) 
We now return to (6.98), and estimate, with $(\xi,\nu)$-high probability



hOih 
Together with (6.99) this yields, with $(\xi,\nu)$-high probability,




(6.100) 



In order to estimate the quantity in parentheses, we abbreviate $\mathbb{IE}_kZ := Z - \mathbb{E}_kZ$ for any random variable
$Z$ and write, using (6.88),

k^i k^i l^k 

= ^ E E + ^) - ^ E - -^^) E 4^' + i) • 

k^i t^k ^ ^ k^i l^k ^ ^ 

Using the large deviation estimate (3.15) in [TT], (|6.96p . and the bound \hik\ ^ which holds with (^, z^)- 
high probability (see Lemma 3.7 in [H]), we find that the second term is bounded by 0{p~^) with (^, z^)-high 
probability. Thus we get 

1^(^..-E,^.,) = ^EE4'^^'^- + o(p-) 

k=/^i l=ik 

with $(\xi,\nu)$-high probability. Therefore (6.100) and Lemma 6.9 imply (6.95), and the proof is complete. □



7. Universality of generalized Wigner matrices with finite moments 



This section is an application of our results to the problem of universality of generalized Wigner matrices (see
Definition 7.1 below) whose entries have heavy tails. We prove the bulk universality of generalized Wigner
matrices under the assumption that the matrix entries have a finite $m$-th moment for some $m > 4$. We also
prove the edge universality of Wigner matrices under the assumption that $m > 12$. (This lower bound can
in fact be improved to $m \geq 7$; see Remark 7.5 below.) The Tracy-Widom law for the largest eigenvalue
of Wigner matrices was first proved in [33] under a Gaussian decay assumption, and was proved later in
[29], [35], [19], [24] under various weaker restrictions on the distributions of the matrix elements. In particular,
in [24] the Tracy-Widom law was proved for entries with symmetric distribution and $m > 12$. In [23]
similar results were derived for complex Hermitian Gaussian divisible matrices, where the GUE component
is of order one. For this case it is proved in [53] that bulk universality holds provided the entries of the
Wigner component have finite second moments, and edge universality holds provided they have finite fourth
moments.

Definition 7.1. We call a Hermitian or real symmetric random matrix $H = (h_{ij})$ a generalized Wigner
matrix if the two following conditions hold. First, the family of upper-triangular entries $(h_{ij} : i \leq j)$ is
independent. Second, we have

$\mathbb{E}h_{ij} = 0$, $\mathbb{E}|h_{ij}|^2 = \sigma_{ij}^2$,

where the variances $\sigma_{ij}^2$ satisfy

$\sum_j\sigma_{ij}^2 = 1$, $C_- \leq \inf_{i,j}(N\sigma_{ij}^2) \leq \sup_{i,j}(N\sigma_{ij}^2) \leq C_+$,

and $0 < C_- \leq C_+ < \infty$ are constants independent of $N$.

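Theorem 7.2 (Bulk universality). Suppose that $H = (h_{ij})$ satisfies Definition 7.1. Let $m > 4$ and
assume that for all $i$ and $j$ we have

$\mathbb{E}|h_{ij}/\sigma_{ij}|^m \leq C_m$, (7.1)

for some constant $C_m$, independent of $i$, $j$, and $N$.

Let $n \in \mathbb{N}$ and $O : \mathbb{R}^n \to \mathbb{R}$ be compactly supported and continuous. Let $E$ satisfy $-2 < E < 2$ and let
$\varepsilon > 0$. Then for any sequence $b_N$ satisfying $N^{-1+\varepsilon} \leq b_N \leq ||E| - 2|/2$ we have

$\lim_{N \to \infty}\int_{E-b_N}^{E+b_N}\frac{dE'}{2b_N}\int d\alpha_1 \cdots d\alpha_n\,O(\alpha_1, \dots, \alpha_n)\,\frac{1}{\varrho_{sc}(E)^n}\left(p_N^{(n)} - p_{G,N}^{(n)}\right)\left(E' + \frac{\alpha_1}{N\varrho_{sc}(E)}, \dots, E' + \frac{\alpha_n}{N\varrho_{sc}(E)}\right) = 0$.

Here $\varrho_{sc}$ was defined in (2.10), $p_N^{(n)}$ is the $n$-point marginal of the eigenvalue distribution of $H$, and $p_{G,N}^{(n)}$
the $n$-point marginal of an $N \times N$ GUE/GOE matrix.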

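Theorem 7.3 (Edge universality). Suppose that $H^v = (h^v_{ij})$ and $H^w = (h^w_{ij})$ both satisfy Definition 7.1.
Assume that the entries of $H^v$ and $H^w$ all satisfy (7.1) for some $m > 12$, and that the first two
moments of the entries $h^v_{ij}$ and $h^w_{ij}$ match:

$\mathbb{E}^v(\overline{h^v_{ij}})^l(h^v_{ij})^u = \mathbb{E}^w(\overline{h^w_{ij}})^l(h^w_{ij})^u$ for $0 \leq l + u \leq 2$.

Then there is a $\delta > 0$ such that for any $s \in \mathbb{R}$ we have

$\mathbb{P}^v(N^{2/3}(\lambda_N - 2) \leq s - N^{-\delta}) - N^{-\delta} \leq \mathbb{P}^w(N^{2/3}(\lambda_N - 2) \leq s) \leq \mathbb{P}^v(N^{2/3}(\lambda_N - 2) \leq s + N^{-\delta}) + N^{-\delta}$. (7.2)

Here $\mathbb{P}^v$ and $\mathbb{P}^w$ denote the laws of the ensembles $H^v$ and $H^w$ respectively, and $\lambda_N$ denotes the largest
eigenvalue of $H^v$ or $H^w$.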

Remark 7.4. A similar result holds for the smallest eigenvalue $\lambda_1$. Moreover, a result analogous to (7.2)
holds for the $n$-point joint distribution functions of the extreme eigenvalues. (See [H], Equation (2.40).)

Remark 7.5. With some additional effort, one may in fact improve the condition $m > 12$ in Theorem 7.3
to $m \geq 7$. The basic idea is to match seven instead of four moments in Lemma 7.7, and to use the resolvent
expansion method from Section 6.3. We omit further details.

The rest of this section is devoted to the proof of Theorems 7.2 and 7.3.

7.1. Truncation. For definiteness, we focus on real symmetric matrices, but the following truncation argument
applies trivially to complex Hermitian matrices by truncating the real and imaginary parts separately.
To simplify the presentation, we consider Wigner matrices for which $\sigma_{ij} = N^{-1/2}$. The proof for the more
general matrices from Definition 7.1 is the same; see also Remark 2.4 in [11].

We begin by noting that, without loss of generality, we may assume that the distributions of the entries
of $H$ are absolutely continuous. Otherwise consider the matrix $H + \varepsilon_NV$, where $V$ is a GUE/GOE matrix
independent of $H$, and $(\varepsilon_N)$ is a positive sequence that tends to zero arbitrarily fast. (Note that the following
argument is insensitive to the size of $\varepsilon_N$.)

Let $H^x = (h^x_{ij})$ be a Wigner matrix whose entries are of the form $h^x_{ij} = N^{-1/2}x_{ij}$ for some $x_{ij}$. We
assume that the family $(x_{ij} : i \leq j)$ is independent, and that each $x_{ij}$ satisfies

$\mathbb{E}x_{ij} = 0$, $\mathbb{E}|x_{ij}|^2 = 1$.

Moreover, we assume that there is an $m > 4$ and a constant $C_m \geq 1$, independent of $i$, $j$, and $N$, such that

$\mathbb{E}|x_{ij}|^m \leq C_m$.

In a first step we construct a truncated Wigner matrix $H^y$ from $H^x$. This truncation is performed in the
following lemma.

Lemma 7.6. Fix $m > 2$ and let $X$ be a real random variable, with absolutely continuous law, satisfying

$\mathbb{E}X = 0$, $\mathbb{E}X^2 = 1$, $\mathbb{E}|X|^m \leq C_m$.

Let $\lambda > 0$. Then there exists a real random variable $Y$ that satisfies

$\mathbb{E}Y = 0$, $\mathbb{E}Y^2 = 1$, $|Y| \leq \lambda$, $\mathbb{P}(X \neq Y) \leq 2C_m\lambda^{-m}$.

Proof. We introduce the abbreviations

$P := \mathbb{P}(|X| > \lambda)$, $E := \mathbb{E}(X\,\mathbf{1}(|X| > \lambda))$, $V := \mathbb{E}(X^2\,\mathbf{1}(|X| > \lambda))$.

Using the assumption on $X$, Markov's inequality, and Hölder's inequality, we find

$P \leq C_m\lambda^{-m}$, $|E| \leq C_m\lambda^{-m+1}$, $V \leq C_m\lambda^{-m+2}$. (7.3)
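The three tail bounds in (7.3) use only the finiteness of the $m$-th absolute moment, so they can be illustrated with exact rational arithmetic on a finite law. The sketch below is such an illustration; the two-point law, the choice $m = 5$ and the function names are assumptions made for the example. The lemma itself requires an absolutely continuous law, which is needed for the later steps of the construction and not for (7.3).

```python
from fractions import Fraction


def tail_quantities(law, lam):
    """Return (P, E, V) from the proof for a finite law {value: probability}."""
    tail = [(x, p) for x, p in law.items() if abs(x) > lam]
    prob = sum((p for _, p in tail), Fraction(0))
    first = sum((x * p for x, p in tail), Fraction(0))
    second = sum((x * x * p for x, p in tail), Fraction(0))
    return prob, first, second


def absolute_moment(law, m):
    """E|X|^m, used here as the constant C_m."""
    return sum((abs(x) ** m * p for x, p in law.items()), Fraction(0))


def satisfies_tail_bounds(law, lam, m):
    """Check the three inequalities of (7.3) with C_m = E|X|^m."""
    c_m = absolute_moment(law, m)
    prob, first, second = tail_quantities(law, lam)
    return (prob <= c_m * lam ** (-m)
            and abs(first) <= c_m * lam ** (-m + 1)
            and second <= c_m * lam ** (-m + 2))


# Hypothetical law with EX = 0 and EX^2 = 1: X = 2 w.p. 1/5 and X = -1/2 w.p. 4/5.
EXAMPLE_LAW = {Fraction(2): Fraction(1, 5), Fraction(-1, 2): Fraction(4, 5)}
```

For this law and $\lambda = 1$ one finds $P = 1/5$, $E = 2/5$ and $V = 4/5$, all below the corresponding bounds with $C_5 = \mathbb{E}|X|^5 = 257/40$.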

The idea behind the construction of $Y$ is to cut out the tail $|X| > \lambda$, to add appropriate Dirac weights
at $\pm\lambda$, and to adjust the total probability by cutting out the values $X \in [-a, a]$, where $a$ is an appropriately
chosen small nonnegative number. For any $t$ satisfying $0 \leq t \leq 1/2$, we choose a nonnegative number $a_t$ such
that $\mathbb{P}(X \in [-a_t, a_t]) = t$. Note that since $X$ is absolutely continuous such a number $a_t$ exists and the map
$t \mapsto a_t$ is continuous. Moreover, using $\mathbb{E}X^2 = 1$ and Markov's inequality we find that $a_t \leq 2$ for $t \leq 1/2$.
We define the quantities

$e_t := \mathbb{E}(X\,\mathbf{1}(-a_t \leq X \leq a_t))$, $v_t := \mathbb{E}(X^2\,\mathbf{1}(-a_t \leq X \leq a_t))$,

which satisfy the trivial bounds

$|e_t| \leq 2t$, $v_t \leq 4t$. (7.4)

We shall remove the values $(-\infty,-\lambda) \cup [-a_t, a_t] \cup (\lambda,\infty)$ from the range of $X$, and replace them with
Dirac weights at $\lambda$ and $-\lambda$ with respective probabilities $p$ and $q$. Thus we are led to the system

$p + q = P + t$, $p - q = \lambda^{-1}(E + e_t)$, $p + q = \lambda^{-2}(V + v_t)$. (7.5)

In order to solve (7.5), we abbreviate the right-hand sides of the equations in (7.5) by $\alpha(t)$, $\beta(t)$, and $\gamma(t)$
respectively.

In a first step, we solve $t$ from the equation $\alpha(t) = \gamma(t)$. To that end, we observe that $\alpha(0) \leq \gamma(0)$, as
follows from the trivial inequality $V \geq \lambda^2P$. Moreover, $\alpha(1/2) \geq \gamma(1/2)$, by (7.3) and (7.4). Since $\alpha(t)$
and $\gamma(t)$ are continuous, the equation $\alpha(t) = \gamma(t)$ has a solution $t_0$. Moreover, (7.3) and (7.4) imply that
$t_0 \leq C_m\lambda^{-m} + 4\lambda^{-2}t_0$, from which we get that $t_0 \leq 2C_m\lambda^{-m}$. For the following we fix $t := t_0$.
In a second step, we solve the equations $p + q = \alpha(t)$ and $p - q = \beta(t)$ to get

$p = \frac{\alpha(t) + \beta(t)}{2}$, $q = \frac{\alpha(t) - \beta(t)}{2}$.

We now claim that $|\beta(t)| \leq \alpha(t)$. Indeed, a simple application of Cauchy-Schwarz yields $|\beta(t)| \leq (\alpha(t) +
\gamma(t))/2 = \alpha(t)$. Hence $p$ and $q$ are nonnegative. Moreover, the bounds (7.3) and (7.4) yield

$p + q \leq 2C_m\lambda^{-m}$.

Thus we have proved that (7.5) has a solution $(p, q, t)$ satisfying

$0 \leq p, q, t \leq 2C_m\lambda^{-m}$.

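Next, let $I := (-\infty, -\lambda) \cup [-a_t, a_t] \cup (\lambda,\infty)$. Thus, $\mathbb{P}(X \in I) = p + q$. Partition $I = I_1 \cup I_2$ such that
$\mathbb{P}(X \in I_1) = p$ and $\mathbb{P}(X \in I_2) = q$. Then we define

$Y := X\,\mathbf{1}(X \notin I) + \lambda\,\mathbf{1}(X \in I_1) - \lambda\,\mathbf{1}(X \in I_2)$.

Recalling (7.5), we find that $Y$ satisfies $\mathbb{E}Y = 0$ and $\mathbb{E}Y^2 = 1$. Moreover,

$\mathbb{P}(X \neq Y) = \mathbb{P}(X \in I) = p + q \leq 2C_m\lambda^{-m}$.

This concludes the proof. □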
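To make the construction concrete, the following sketch carries it out for one explicit law, under the assumption that $X$ is uniform on $[-\sqrt{3}, \sqrt{3}]$ (so that $\mathbb{E}X = 0$ and $\mathbb{E}X^2 = 1$) and that $0 < \lambda < \sqrt{3}$. For this law $P = 1 - \lambda/\sqrt{3}$, $V = 1 - \lambda^3/(3\sqrt{3})$, $a_t = \sqrt{3}\,t$, $e_t = 0$ and $v_t = t^3$; these closed forms, the bisection solver and all names are part of the illustration, not of the original argument.

```python
import math

SQRT3 = math.sqrt(3.0)  # X is assumed uniform on [-sqrt(3), sqrt(3)]


def alpha(t, lam):
    """alpha(t) = P + t with P = P(|X| > lam)."""
    return (1.0 - lam / SQRT3) + t


def gamma(t, lam):
    """gamma(t) = lam^-2 (V + v_t); for the uniform law v_t = t^3."""
    tail_second_moment = 1.0 - lam ** 3 / (3.0 * SQRT3)
    return (tail_second_moment + t ** 3) / lam ** 2


def solve_t0(lam, steps=200):
    """Bisection for alpha(t) = gamma(t) on [0, 1/2], as in the first step of the proof."""
    lo, hi = 0.0, 0.5
    if not (alpha(lo, lam) <= gamma(lo, lam) and alpha(hi, lam) >= gamma(hi, lam)):
        raise ValueError("no sign change on [0, 1/2] for this lam")
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if alpha(mid, lam) <= gamma(mid, lam):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def truncated_law(lam):
    """Return (t0, p, q, EY, EY^2) for the truncated variable Y."""
    t0 = solve_t0(lam)
    p = q = 0.5 * alpha(t0, lam)   # beta(t) = 0 because the law is symmetric
    a = SQRT3 * t0                 # P(|X| <= a) = t0
    kept_second_moment = (lam ** 3 - a ** 3) / (3.0 * SQRT3)  # E X^2 1(a < |X| <= lam)
    mean = lam * (p - q)           # the kept part has mean zero by symmetry
    second = kept_second_moment + lam ** 2 * (p + q)
    return t0, p, q, mean, second
```

For $\lambda = 1.5$ the solver returns a small $t_0$ and the second moment of $Y$ comes out equal to 1 up to rounding, as (7.5) requires. Since the uniform law has bounded support, this only illustrates the mechanics of the construction and not the regime of large $\lambda$ used in the application.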

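Note that the variable $Y$ constructed in Lemma 7.6 satisfies $\mathbb{E}|Y|^m \leq 3C_m$. Let $\rho$ be some exponent
satisfying $0 < \rho < 1/2$ and assume that $m > 4$. Using Lemma 7.6 with $\lambda := N^{\rho}$, we construct, for each $x_{ij}$,
a random variable $y_{ij}$ such that the family $(y_{ij} : i \leq j)$ is independent and

$\mathbb{E}y_{ij} = 0$, $\mathbb{E}y_{ij}^2 = 1$, $|y_{ij}| \leq N^{\rho}$, $\mathbb{P}(x_{ij} \neq y_{ij}) \leq 2C_mN^{-\rho m}$, $\mathbb{E}|y_{ij}|^m \leq 3C_m$. (7.6)

We define the new matrix $H^y = (h^y_{ij})$ through $h^y_{ij} := N^{-1/2}y_{ij}$. In particular, we have

$\mathbb{E}h^y_{ij} = 0$, $\mathbb{E}|h^y_{ij}|^2 = \frac{1}{N}$, $\mathbb{E}|h^y_{ij}|^p \leq \frac{1}{Nq^{p-2}}$, (7.7)

where we set

$q := N^{1/2-\rho}$. (7.8)

Thus, $H^y$ satisfies Definition 2.1.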



7.2. Moment matching. Next, we construct a third Wigner matrix, $H^z = (h^z_{ij})$, whose entries are of the
form $h^z_{ij} = N^{-1/2}z_{ij}$. We require that $z_{ij}$ have uniformly subexponential decay, i.e.

$\mathbb{E}z_{ij} = 0$, $\mathbb{E}|z_{ij}|^2 = 1$, $\mathbb{P}(|z_{ij}| \geq \xi) \leq \theta^{-1}e^{-\xi^{\theta}}$, (7.9)

for some $\theta > 0$ independent of $i$, $j$, and $N$. We choose $z_{ij}$ so as to match the first four moments of $y_{ij}$.

Lemma 7.7. Let $y_{ij}$ satisfy (7.6) with some $m > 4$. Then there exists a $z_{ij}$ satisfying (7.9) such that
$\mathbb{E}z_{ij}^l = \mathbb{E}y_{ij}^l$ for $l = 1, \dots, 4$.

Proof. In fact, using an explicit construction similar to the one used in the proof of Theorem 2.5, $z_{ij}$ can
be chosen to be supported at only three points. We omit further details. □

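It was proved in [H], Section 2.1, that the statement of Theorem 7.2 holds if the entries of $H$ satisfy the
subexponential decay condition (7.9). Theorem 7.2 will therefore follow if we can prove that, for $b_N$ and $O$
as in Theorem 7.2, we have

$\lim_{N \to \infty}\int_{E-b_N}^{E+b_N}\frac{dE'}{2b_N}\int d\alpha_1 \cdots d\alpha_n\,O(\alpha_1, \dots, \alpha_n)\,\frac{1}{\varrho_{sc}(E)^n}\left(p_{x,N}^{(n)} - p_{z,N}^{(n)}\right)\left(E' + \frac{\alpha_1}{N\varrho_{sc}(E)}, \dots, E' + \frac{\alpha_n}{N\varrho_{sc}(E)}\right) = 0$, (7.10)

where $p_{x,N}^{(n)}$ and $p_{z,N}^{(n)}$ are the $n$-point marginals of the eigenvalue distributions of $H^x$ and $H^z$, respectively.

Similarly, it was proved in [19], Section 2.2, that the statement of Theorem 7.3 holds if the entries of
$H^v$ and $H^w$ both satisfy the subexponential decay condition (7.9). Thus, Theorem 7.3 will follow if we can
prove that

$\mathbb{P}^x(N^{2/3}(\lambda_N - 2) \leq s - N^{-\delta}) - N^{-\delta} \leq \mathbb{P}^z(N^{2/3}(\lambda_N - 2) \leq s) \leq \mathbb{P}^x(N^{2/3}(\lambda_N - 2) \leq s + N^{-\delta}) + N^{-\delta}$, (7.11)

for some $\delta > 0$. Here we use $\mathbb{P}^x$ and $\mathbb{P}^z$ to denote the laws of ensembles $H^x$ and $H^z$ respectively.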

We shall prove both (7.10) and (7.11) by first comparing $H^x$ to $H^y$ and then comparing $H^y$ to $H^z$. The
first step is easy: from (7.6) we get

$\mathbb{P}(H^x \neq H^y) \leq 2C_mN^{2-\rho m}$. (7.12)

Thus, (7.10) and (7.11) hold with $z$ replaced by $y$ provided that

$\rho m > 2$. (7.13)

7.3. Comparison of $H^y$ and $H^z$ in the bulk. In this section we prove that (7.10) holds with $x$ replaced by
$y$, and hence complete the proof of Theorem 7.2.

We compare the local spectral statistics of $H^y$ and $H^z$ using the Green function comparison method
from [17], Section 8. The key additional input is the local semicircle theorem for sparse matrices, Theorem
3.3. We merely sketch the differences to [17]. As explained in [17], the $n$-point correlation functions $p_N^{(n)}$ can
be expressed (up to an error that is a negative power of $N$) in terms of expectations of observables $F$, whose arguments are products
of expressions of the form $m(z_i + i\eta)$ where $\eta := N^{-1-\varepsilon}$. We assume that the first five derivatives of $F$ are
polynomially bounded, uniformly in $N$. Using the local semicircle law for sparse matrices, Theorem 3.3, we
may control the Green function matrix elements down to scales $N^{-1-\varepsilon}$, uniformly in $E$. (Note that in [17],
the spectral edge had to be excluded since the bounds derived there were unstable near the edge, unlike our
bound (3.14).) This allows us to compare the local eigenvalue statistics of the matrix ensembles at scales
$N^{-1-\varepsilon}$, which is sufficiently accurate for both Theorems 7.2 and 7.3.

We use the telescopic summation and the Lindeberg replacement argument from [17], Chapter 8, whose
notations we take over without further comment; see also Section 6.3. A resolvent expansion yields

$S = R - N^{-1/2}RVR + N^{-1}(RV)^2R - N^{-3/2}(RV)^3R + N^{-2}(RV)^4R - N^{-5/2}(RV)^5S$.

Next, note that, by (7.7) and (7.9), both ensembles $H^y$ and $H^z$ satisfy Definition 2.1 with $q$ defined in
(7.8). Hence we may invoke Theorem 3.3 with $f = 0$, and in particular (3.14), for the matrices $H^y$ and $H^z$.
Choosing $\eta = N^{-1-\varepsilon}$ for some $\varepsilon > 0$, we therefore get

$|R_{ij}(E + i\eta)| \leq N^{2\varepsilon}$, $|S_{ij}(E + i\eta)| \leq N^{2\varepsilon}$

with $(\xi,\nu)$-high probability.

Consider now the difference $\mathbb{E}^y - \mathbb{E}^z$ applied to some fixed observable $F$ depending on traces (normalized
by $N^{-1}$) of resolvents and whose derivatives have at most polynomial growth. Since the first four moments
of the entries of $H^y$ and $H^z$ coincide by Lemma 7.7, the error in one step of the telescopic summation is
bounded by the expectation of the rest term in the resolvent expansion, i.e.

$N^{C\varepsilon}N^{-5/2}\mathbb{E}|(RV)^5S| \leq N^{-5/2+C\varepsilon}\max_{a,b}\mathbb{E}|V_{ab}|^5 \leq N^{-5/2+C\varepsilon}C_mN^{\rho}$,

where in the last step we used (7.6). The first factor $N^{C\varepsilon}$ comes from the polynomially bounded derivatives
of $F$. Summing up all $O(N^2)$ terms of the telescopic sum, we find that the difference $\mathbb{E}^y - \mathbb{E}^z$ applied to $F$
is bounded by

$N^{-1/2+C\varepsilon+\rho}$. (7.14)

Combining (7.14) and (7.12), we find that (7.10) follows provided that

$-\frac{1}{2} + C\varepsilon + \rho < 0$, $\rho m > 2$. (7.15)

Since $m > 4$ is fixed, choosing first $1/2 - \rho$ small enough and then $\varepsilon$ small enough yields (7.15). This
completes the proof of Theorem 7.2.

7.4. Comparison of $H^y$ and $H^z$ at the edge. In order to prove (7.2) under the assumption $m > 12$, we may
invoke Proposition 6.4, which implies that (7.11) holds with $x$ replaced by $y$, provided that $\phi = 1/2 - \rho > 1/3$,
i.e. $\rho < 1/6$. Together with the condition (7.13), this implies that (7.2) holds if $m > 12$.
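The two moment thresholds come from elementary constraints on the truncation exponent $\rho$: the truncation step needs $\rho m > 2$, the bulk comparison needs $\rho < 1/2$ (up to the small loss $C\varepsilon$), and the edge comparison needs $\rho < 1/6$. The following sketch, with hypothetical helper names, computes the resulting open window for $\rho$ and shows that it is nonempty exactly when $m > 4$ in the bulk and $m > 12$ at the edge. The $C\varepsilon$ term is ignored here, which is harmless because the windows are open.

```python
from fractions import Fraction


def rho_window(m, upper):
    """Open interval (2/m, upper) of admissible rho, or None if it is empty."""
    lower = Fraction(2) / Fraction(m)
    return (lower, upper) if lower < upper else None


def bulk_window(m):
    """Constraints rho * m > 2 and rho < 1/2, cf. (7.13) and (7.15)."""
    return rho_window(m, Fraction(1, 2))


def edge_window(m):
    """Constraints rho * m > 2 and rho < 1/6, cf. (7.13) and Section 7.4."""
    return rho_window(m, Fraction(1, 6))
```

For example, `edge_window(12)` is empty while `edge_window(13)` is the interval $(2/13, 1/6)$, in line with the strict inequality $m > 12$ in Theorem 7.3.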



A. Regularization of the Dyson Brownian motion 



In this appendix we sketch a simple regularization argument needed to prove two results concerning the
Dyson Brownian motion (DBM). This argument can be used as a substitute for earlier, more involved,
proofs given in Appendices A and B of [16] on the existence of the dynamics restricted to the subdomain
$\Sigma_N := \{x : x_1 < x_2 < \cdots < x_N\}$, and on the applicability of the Bakry-Emery method. The argument
presented in this section is also more probabilistic in nature than the earlier proofs of [16].

For applications in Section 4 of this paper, some minor adjustments to the argument below are needed to
incorporate the separate treatment of the largest eigenvalue. These modifications are straightforward, and
we shall only sketch the argument for the standard DBM.

Theorem A.1. Fix $n \geq 1$ and let $m = (m_1, \dots, m_n) \in \mathbb{N}^n$ be an increasing family of indices. Let $G : \mathbb{R}^n \to \mathbb{R}$
be a continuous function of compact support and set

$\mathcal{G}_{i,m}(x) := G(N(x_i - x_{i+m_1}), N(x_{i+m_1} - x_{i+m_2}), \dots, N(x_{i+m_{n-1}} - x_{i+m_n}))$.






Let $\gamma_1, \dots, \gamma_N$ denote the classical locations of the eigenvalues and set

N 



Q '■= sup ^ / {x, ~ ^,)'^ftdn 



(A.l) 



Choose an $\varepsilon > 0$. Then for any $\rho$ satisfying $0 < \rho < 1$, and setting $\tau = N^{-\rho}$, there exists a $\bar\tau \in [\tau/2, \tau]$ such
that, for any $J \subset \{1, 2, \dots, N - m_n - 1\}$, we have



iNQ + 1 



\J\t 



(A.2) 



for all $N \geq N_0(\rho)$. Here $\mu = \mu^{(N)}$ is the equilibrium measure of the $N$ eigenvalues of the GOE.

Define $\mu_\beta(x) = Ce^{-N\beta\mathcal{H}(x)}$ as in (4.6) and (4.5), but introducing a parameter $\beta$ so that $\mu_\beta$ is the
equilibrium measure of the usual $\beta$-ensemble which is invariant under the ($\beta$-dependent) DBM. We remark
that Theorem A.1 holds for all $\beta \geq 1$ with an identical proof. The following lemma holds more generally for
$\beta > 0$.

Let u! Cfipe^^^i Uj(xj) ^ where Uj is a C^-function satisfying 



(A.3) 



Lemma A. 2. Let /3 > and q g H^{duj) be a probability density with respect to uj. Then for any /? > and 
any J C {1, 2, . . . , iV — m„ — 1} and any t > we have 



mmU''{x) ^ T-^ 
j 

for some r < 1. For the following lemma we recall the definition (|4.7p of the Dirichlet form. 



/^|:g,.,d.-/±|:c?,^d. 



\j\ 



Recall that the DBM is defined via the stochastic differential equation 



Axi 



1 



2N ^ Xi- X, 



dt 



for i = I, . . 



.,N, 



(A.4) 



(A.5) 



where $B_1, \dots, B_N$ is a family of independent standard Brownian motions. It was proved in [1], Lemma 4.3.3,
that there is a unique strong solution to (A.5) for all $\beta \geq 1$.
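
As a purely illustrative aside, the drift and noise scaling of such a dynamics can be sketched numerically. The Python sketch below assumes the $\beta = 1$ normalization $dx_i = dB_i/\sqrt{N} + \bigl(-x_i/4 + \frac{1}{2N}\sum_{j \neq i} (x_i - x_j)^{-1}\bigr)\,dt$; the function names `dbm_drift` and `euler_maruyama_step` are hypothetical and do not come from the paper. A naive Euler-Maruyama discretization is used only for illustration: unlike the strong solution, a discretized path with a coarse time step may leave the ordered domain.

```python
import math
import random


def dbm_drift(x):
    """Drift of the (assumed) beta = 1 DBM: -x_i/4 + (1/2N) * sum_{j != i} 1/(x_i - x_j)."""
    n = len(x)
    drift = []
    for i in range(n):
        # Pairwise repulsion from all other particles.
        repulsion = sum(1.0 / (x[i] - x[j]) for j in range(n) if j != i)
        drift.append(-x[i] / 4.0 + repulsion / (2.0 * n))
    return drift


def euler_maruyama_step(x, dt, rng):
    """One Euler-Maruyama step: x_i + drift_i * dt + sqrt(dt / N) * xi_i with xi_i ~ N(0, 1)."""
    n = len(x)
    b = dbm_drift(x)
    noise_scale = math.sqrt(dt / n)
    return [x[i] + b[i] * dt + noise_scale * rng.gauss(0.0, 1.0) for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [-1.0, 0.0, 1.0]
    for _ in range(100):
        x = euler_maruyama_step(x, 1e-4, rng)
    print(x)
```

The repulsion terms are antisymmetric in $(i, j)$, so they cancel in the sum over $i$ and the total drift equals $-\frac{1}{4}\sum_i x_i$; this gives a quick sanity check of an implementation.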

For any $\delta > 0$ define the extension $\mu_\beta^\delta$ of the measure $\mu_\beta$ from $\Sigma_N$ to $\mathbb{R}^N$ by replacing the singular
logarithm with a $C^2$-function. To that end, we introduce the approximation parameter $\delta > 0$ and define, as
in Section 4, for $\mathbf{x} \in \mathbb{R}^N$,



where we set 



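$$ \log_\delta(x) := \mathbf{1}(x \geq \delta) \log x + \mathbf{1}(x < \delta) \left( \log \delta + \frac{x - \delta}{\delta} - \frac{(x - \delta)^2}{2 \delta^2} \right). $$

It is easy to check that $\log_\delta \in C^2(\mathbb{R})$, is concave, and satisfies

$$ \lim_{\delta \to 0} \log_\delta(x) = \begin{cases} \log x & \text{if } x > 0, \\ -\infty & \text{if } x \leq 0. \end{cases} $$

Furthermore, we have the lower bound

$$ \partial_x^2 \log_\delta(x) \geq \begin{cases} -\dfrac{1}{x^2} & \text{if } x > \delta, \\[2mm] -\dfrac{1}{\delta^2} & \text{if } x \leq \delta. \end{cases} $$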
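
The regularity properties of the regularized logarithm can be checked numerically. The following is a minimal Python sketch, assuming that below the threshold $\delta$ the function is the second-order Taylor polynomial of $\log x$ at $\delta$, namely $\log \delta + (x - \delta)/\delta - (x - \delta)^2/(2\delta^2)$; the function names are illustrative and not taken from the paper.

```python
import math


def log_delta(x: float, delta: float) -> float:
    """Regularized logarithm: log x for x >= delta, quadratic Taylor extension below delta."""
    if x >= delta:
        return math.log(x)
    u = (x - delta) / delta
    return math.log(delta) + u - 0.5 * u * u


def d1_log_delta(x: float, delta: float) -> float:
    """First derivative: 1/x for x >= delta, linear in x below delta."""
    if x >= delta:
        return 1.0 / x
    return 1.0 / delta - (x - delta) / (delta * delta)


def d2_log_delta(x: float, delta: float) -> float:
    """Second derivative: -1/x^2 for x >= delta, the constant -1/delta^2 below delta."""
    if x >= delta:
        return -1.0 / (x * x)
    return -1.0 / (delta * delta)


if __name__ == "__main__":
    delta, h = 0.1, 1e-9
    # The function and its first two derivatives match across the threshold x = delta.
    for f in (log_delta, d1_log_delta, d2_log_delta):
        print(f.__name__, f(delta - h, delta), f(delta + h, delta))
    # For fixed x <= 0 the value decreases without bound as delta shrinks.
    print([log_delta(-1.0, d) for d in (0.1, 0.01, 0.001)])
```

Since the second derivative is strictly negative everywhere, the sketch is consistent with concavity, and the function is finite on all of $\mathbb{R}$, which is what allows the dynamics to be posed on $\mathbb{R}^N$ rather than on the ordered domain.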



Similarly, we can extend the measure $\omega$ to $\mathbb{R}^N$ by setting $\omega_\delta = C e^{-N \sum_j U_j(x_j)} \mu_\beta^\delta$.

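Lemma A.3. Let $q \in L^\infty(d\omega_\delta)$ be a probability density. Then for $\delta \leq 1/N$, $\beta > 0$ and any
$J \subset \{1, 2, \dots, N - m_n - 1\}$ and any $t > 0$ we have

$$ \left| \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, q \, d\omega_\delta - \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, d\omega_\delta \right| \leq C \sqrt{\frac{D_{\omega_\delta}(\sqrt{q}) \, t}{|J|}} + C \sqrt{S_{\omega_\delta}(q)} \, e^{-ct/\tau}. \tag{A.6} $$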



Proof. The proof of Theorem 4.3 in [16] applies with merely cosmetic changes; now, however, the dynamics
is defined on $\mathbb{R}^N$ instead of $\Sigma_N$, so that complications arising from the boundary are absent. The condition
$\delta \leq 1/N$ is needed since we use the singularity of $\partial_x^2 \log x$ to generate a factor $1/N^2$ in the regime $x \leq C/N$
in the proof. □



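Proof of Lemma A.2. Suppose that $q$ is a probability density in $\Sigma_N$ with respect to $\omega$. We extend $q$ to
be zero outside $\Sigma_N$ and let $q_\varepsilon$ be any regularization of $q$ on $\mathbb{R}^N$ that converges to $q$ in $H^1(\omega)$. Then
there is a constant $C_{\varepsilon,\delta}$ such that $q_\varepsilon^\delta := C_{\varepsilon,\delta} \, q_\varepsilon$ is a probability density with respect to $\omega_\delta$. Thus (A.6) holds
with $q$ replaced by $q_\varepsilon^\delta$. Taking the limit $\delta \to 0$ and then $\varepsilon \to 0$, we have

$$ \left| \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, q \, d\omega - \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, d\omega \right| \leq C \lim_{\varepsilon \to 0} \lim_{\delta \to 0} \sqrt{\frac{D_{\omega_\delta}\bigl(\sqrt{q_\varepsilon^\delta}\bigr) \, t}{|J|}} + C \lim_{\varepsilon \to 0} \lim_{\delta \to 0} \sqrt{S_{\omega_\delta}(q_\varepsilon^\delta)} \, e^{-ct/\tau}. \tag{A.7} $$

Notice that $\omega_\delta \to \mathbf{1}(\Sigma_N) \, \omega$ weakly as $\delta \to 0$. Thus

$$ \lim_{\varepsilon \to 0} \lim_{\delta \to 0} D_{\omega_\delta}\bigl(\sqrt{q_\varepsilon^\delta}\bigr) = \lim_{\varepsilon \to 0} D_\omega(\sqrt{q_\varepsilon}) = D_\omega(\sqrt{q}) $$

provided that $q \in H^1(\omega)$. This proves Lemma A.2. Notice that the proof did not use the existence of the DBM;
instead, it used the existence of the regularized DBM. □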



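Proof of Theorem A.1. Write

$$ \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, f_t \, d\mu = \mathbb{E}^{f_0 \mu} \, \mathbb{E}^{\mathbf{x}_0} \, \frac{1}{|J|} \sum_{i \in J} G_{i,m}(\mathbf{x}(t)). $$

Here $\mathbb{E}^{\mathbf{x}_0}$ denotes expectation with respect to the law of the DBM $(\mathbf{x}(t))_t$ starting from $\mathbf{x}_0$, and $\mathbb{E}^{f_0 \mu}$ denotes
expectation of $\mathbf{x}_0$ with respect to the measure $f_0 \mu$. Let $\mathbb{E}_\delta^{\mathbf{x}_0}$ denote expectation with respect to the regularized
DBM. Then we have

$$ \mathbb{E}^{f_0 \mu} \, \mathbb{E}^{\mathbf{x}_0} \, \frac{1}{|J|} \sum_{i \in J} G_{i,m}(\mathbf{x}(t)) = \lim_{\delta \to 0} \mathbb{E}^{f_0 \mu} \, \mathbb{E}_\delta^{\mathbf{x}_0} \, \frac{1}{|J|} \sum_{i \in J} G_{i,m}(\mathbf{x}(t)), $$

where we have used the existence of a strong solution to the DBM (see [1], Lemma 4.3.3) and that the
dynamics remains in $\Sigma_N$ almost surely. Hence

$$ \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, f_t \, d\mu = \lim_{\delta \to 0} \int \frac{1}{|J|} \sum_{i \in J} G_{i,m} \, f_t^\delta \, d\mu_\delta, $$

where $f_t^\delta$ is the solution to the regularized DBM at the time $t$ with initial data $f_0 \mu / \mu_\delta$. Using that (A.2)
holds for the regularized dynamics, and taking the limit $\delta \to 0$, we complete the proof. □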